Multi-band digital compensator for a non-linear system

ABSTRACT

A pre-distorter that both accurately compensates for the non-linearities of a radio frequency transmit chain, and that imposes as few computation requirements as possible in terms of arithmetic operations, uses a diverse set of real-valued signals that are derived from separate band signals that make up the input signal. The derived real signals are passed through configurable non-linear transformations, which may be adapted during operation, and which may be efficiently implemented using lookup tables. The outputs of the non-linear transformations serve as gain terms for a set of complex signals, which are functions of the input, and which are summed to compute the pre-distorted signal. A small set of the complex signals and derived real signals may be selected for a particular system to match the classes of non-linearities exhibited by the system, thereby providing further computational savings, and reducing complexity of adapting the pre-distortion through adapting of the non-linear transformations.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a Continuation of U.S. application Ser. No. 16/656,686, filed Oct. 18, 2019, which is a Continuation-In-Part (CIP) of U.S. application Ser. No. 16/408,979, filed May 10, 2019, which claims the benefit of U.S. Provisional Application No. 62/747,994, and U.S. Provisional Application No. 62/670,315, filed on May 11, 2018. U.S. application Ser. No. 16/656,686 is also a Continuation-In-Part (CIP) of PCT Application No. PCT/US2019/031714, which claims the benefit of U.S. Provisional Application No. 62/747,994, and U.S. Provisional Application No. 62/670,315, filed on May 11, 2018. U.S. application Ser. No. 16/656,686 also claims the benefit of U.S. Provisional Application No. 62/804,986, filed on Feb. 13, 2019. The above-referenced applications are incorporated herein by reference.

BACKGROUND

This invention relates to digital compensation of a non-linear circuit or system, for instance linearizing a non-linear power amplifier and radio transmitter chain with a multi-band input, and in particular to effective parameterization of a digital pre-distorter used for digital compensation.

One method for compensation of such a non-linear circuit is to “pre-distort” (or “pre-invert”) the input. For example, an ideal circuit outputs a desired signal u[.] unchanged (or purely scaled or modulated), such that y[.]=u[.], while the actual non-linear circuit has an input-output transformation y[.]=F(u[.]), where the notation y[.] denotes a discrete time signal. A compensation component is introduced before the non-linear circuit that transforms the input u[.], which represents the desired output, to a predistorted input v[.] according to a transformation v[.]=C(u[.]). Then this predistorted input is passed through the non-linear circuit, yielding y[.]=F(v[.]). The functional form and selectable parameter values that specify the transformation C( ) are chosen such that y[.]≈u[.] as closely as possible in a particular sense (e.g., minimizing mean squared error), thereby linearizing the operation of the tandem arrangement of the pre-distorter and the non-linear circuit as well as possible.
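By way of illustration only, the tandem arrangement described above can be sketched numerically. The cubic model `F`, its coefficient `a`, and the first-order pre-inverse `C` are hypothetical choices made for this sketch and are not taken from any particular transmit chain.

```python
# Toy illustration of pre-distortion: y = F(C(u)) should be closer to u than F(u).
# F, C and the coefficient a are assumptions chosen only for this sketch.

def F(v, a=0.1):
    """Hypothetical memoryless compressive non-linearity acting on one sample."""
    return v - a * v * abs(v) ** 2

def C(u, a=0.1):
    """First-order pre-inverse of F (an illustrative compensator, not an exact inverse)."""
    return u + a * u * abs(u) ** 2

u = [0.5 + 0.2j, -0.3 + 0.4j, 0.1 - 0.6j]

# Mean squared error between the circuit output and the desired signal.
err_without = sum(abs(F(x) - x) ** 2 for x in u) / len(u)
err_with = sum(abs(F(C(x)) - x) ** 2 for x in u) / len(u)
```

In this sketch the residual error with the compensator in place is smaller than without it, but not zero, which reflects the "as closely as possible" criterion rather than an exact inversion.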

In some examples, the digital pre-distorter (DPD) performs the transformation of the desired signal u[.] to the input y[.] by using delay elements to form a set of delayed versions of the desired signal (up to a maximum delay τ_(P)), and then using a non-linear polynomial function of those delayed inputs. In some examples, the non-linear function is a Volterra series:

$y[n] = x_0 + \sum_{p} \sum_{\tau_1, \ldots, \tau_p} x_p(\tau_1, \ldots, \tau_p) \prod_{j=1}^{p} u[n-\tau_j]$

or

$y[n] = x_0 + \sum_{p} \sum_{\tau_1, \ldots, \tau_{2p-1}} x_p(\tau_1, \ldots, \tau_{2p-1}) \prod_{j=1}^{p} u[n-\tau_j] \prod_{j=p+1}^{2p-1} u[n-\tau_j]^{*}$

In some examples, the non-linear function uses a reduced set of Volterra terms or a delay polynomial:

$y[n] = x_0 + \sum_{p} \sum_{\tau} x_p(\tau)\, u[n-\tau]\, |u[n-\tau]|^{p-1}$.
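As a minimal sketch of the delay polynomial form above, the following evaluates the sum over powers p and delays τ directly. The function name `delay_polynomial`, the dictionary layout of the coefficients, and the treatment of samples before the start of the signal as zero are assumptions made for illustration.

```python
# Direct evaluation of y[n] = x0 + sum_p sum_tau x_p(tau) * u[n-tau] * |u[n-tau]|**(p-1).
# The coefficient layout {(p, tau): x_p(tau)} is a hypothetical convenience.

def delay_polynomial(u, x0, coeffs):
    y = []
    for n in range(len(u)):
        acc = x0
        for (p, tau), x in coeffs.items():
            if n - tau >= 0:  # samples before the start are treated as zero
                s = u[n - tau]
                acc += x * s * abs(s) ** (p - 1)
        y.append(acc)
    return y

u = [1 + 0j, 0.5j, -0.5 + 0j]
coeffs = {(1, 0): 1.0, (3, 0): -0.1, (1, 1): 0.05}
y = delay_polynomial(u, 0.0, coeffs)
```

The cost of this direct form grows with the number of (p, τ) pairs, each of which requires a power of the magnitude and a complex multiply per sample.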

In these cases, the particular compensation function C is determined by the values of the numerical configuration parameters x_(p).

In the case of a radio transmitter, the desired input u[.] may be a complex discrete time baseband signal of a transmit band, and y[.] may represent that transmit band as modulated to the carrier frequency of the radio transmitter by the function F( ) that represents the radio transmit chain. That is, the radio transmitter may modulate and amplify the input v[.] to a (real continuous-time) radio frequency signal p(.) which when demodulated back to baseband, limited to the transmit band and sampled, is represented by y[.].

There is a need for a pre-distorter with a form that both accurately compensates for the non-linearities of the transmit chain, and that imposes as few computation requirements as possible, both in terms of arithmetic operations to be performed to pre-distort a signal and in terms of the storage requirements of values of the configuration parameters. There is also a need for the form of the pre-distorter to be robust to variation in the parameter values and/or to variation of the characteristics of the transmit chain so that performance degradation of pre-distortion does not exceed that which may be commensurate with the degree of such variation.

In some systems, the input to a radio transmit chain is made up of separate channels occupying distinct frequency bands, generally with frequency regions separating those bands in which no transmission is desired. In such a situation, linearization of the circuit (e.g., the power amplifier) has the dual purpose of improving the linearity of the system in each of the distinct frequency bands, and reducing unwanted emissions between the bands. For example, interaction between the bands resulting from intermodulation distortion may cause such unwanted emission.

One approach to linearizing a system with a multi-band input is essentially to ignore the multi-band nature of the input. However, such an approach may require substantial computation resources, and require representation of the input signal and predistorted signal at a high sampling rate in order to capture the non-linear interactions between bands. Another approach is to linearize each band independently. However, ignoring the interaction between bands generally yields poor results. Some approaches have relaxed the independent linearization of each band by adapting coefficients of non-linear functions (e.g., polynomials) based on more than one band. However, there remains a need for improved multi-band linearization and/or reduced computation associated with such linearization.

SUMMARY

In one aspect, in general, a pre-distorter that both accurately compensates for the non-linearities of a radio frequency transmit chain, and that imposes as few computation requirements as possible in terms of arithmetic operations and storage requirements, uses a diverse set of real-valued signals that are derived from the input signal, for example from separate band signals and their combinations, as well as optional input envelope and other relevant measurements of the system. The derived real signals are passed through configurable non-linear transformations, which may be adapted during operation based on sensed output of the transmit chain, and which may be efficiently implemented using lookup tables. The outputs of the non-linear transformations serve as gain terms for a set of complex signals, which are transformations of the input or transformations of separate bands or combinations of separate bands of the input. The gain-adjusted complex signals are summed to compute the pre-distorted signal, which is passed to the transmit chain. A small set of the complex signals and derived real signals may be selected for a particular system to match the non-linearities exhibited by the system, thereby providing further computational savings, and reducing complexity of adapting the pre-distortion through adapting of the non-linear transformations.

In another aspect, in general, a method of signal predistortion linearizes a non-linear circuit. An input signal (u) is processed to produce multiple transformed signals (w). The transformed signals are processed to produce multiple phase-invariant derived signals (r). These phase-invariant derived signals (r) are determined such that each derived signal (r_(j)) is equal to a non-linear function of one or more of the transformed signals. The derived signals are phase-invariant in the sense that a change in the phase of a transformed signal does not change the value of the derived signal. At least some of the derived signals are equal to functions of different one or more of the transformed signals. A distortion term is then formed by accumulating multiple terms. Each term is a product of a transformed signal of the transformed signals and a time-varying gain. The time-varying gain is a function (Φ) of one or more of the phase-invariant derived signals. The function of the one or more of the phase-invariant derived signals is decomposable into a combination of one or more parametric functions (ϕ) of a corresponding single one of the phase-invariant derived signals (r_(j)) yielding a corresponding one of the time-varying gain components (g_(i)). An output signal (v) is determined from the distortion term and provided for application to the non-linear circuit.

In another aspect, in general, a method of signal predistortion for linearizing a non-linear circuit involves processing an input signal (u) that comprises multiple separate band signals (u₁, . . . , u_(N) _(b) ), where each separate band signal has a separate frequency range within the input frequency range of the input signal and at least part of the input frequency range contains none of the separate frequency ranges. The processing produces a set of transformed signals (w), the transformed signals including at least one transformed signal equal to a combination of multiple separate band signals. Multiple phase-invariant derived signals (r) are determined to be equal to respective non-linear functions of one or more of the transformed signals. The phase-invariant derived signals (r) are transformed according to multiple parametric non-linear transformations (Φ) to produce a set of gain components (g). A distortion term is formed by accumulating multiple terms (indexed by k), with each term being a combination of a transformed signal (w_(a) _(k) ) of the transformed signals and respective one or more time-varying gain components (g_(i), i∈Λ_(k)) of the set of gain components. An output signal (v) determined from the distortion term is provided for application to the non-linear circuit.

Aspects may include one or more of the following features.

The non-linear circuit includes a radio-frequency section including a radio-frequency modulator configured to modulate the output signal to a carrier frequency to form a modulated signal and an amplifier for amplifying the modulated signal.

The input signal (u) includes quadrature components of a baseband signal for transmission via the radio-frequency section. For example, the input signal (u) and the transformed signals (w) comprise complex-valued signals with the real and imaginary parts of the complex signal representing the quadrature components.

The input signal (u) and the transformed signals (w) are complex-valued signals.

Processing the input signal (u) to produce the transformed signals (w) includes forming at least one of the transformed signals as a linear combination of the input signal (u) and one or more delayed versions of the input signal.

Forming at least one of the transformed signals as a linear combination includes forming a linear combination with at least one imaginary or complex multiple of the input signal or of a delayed version of the input signal.

At least one of the transformed signals, w_(k), is formed to be a multiple of D_(α)w_(a)+j^(d)w_(b), where w_(a) and w_(b) are other of the transformed signals, and D_(α) represents a delay by α, and d is an integer between 0 and 3.
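A minimal sketch of the combination D_(α)w_(a)+j^(d)w_(b) follows. The function name `combine`, the restriction to a non-negative integer delay, and the treatment of samples before the start of a signal as zero are assumptions made for illustration.

```python
# Sketch of w_k = D_alpha(w_a) + j**d * w_b for sampled signals held in lists.
# Zero-padding at the start of the delayed signal is an assumption of this sketch.

def combine(w_a, w_b, alpha, d):
    rot = (1, 1j, -1, -1j)[d % 4]  # the four exact powers of j
    out = []
    for t in range(len(w_b)):
        delayed = w_a[t - alpha] if 0 <= t - alpha < len(w_a) else 0
        out.append(delayed + rot * w_b[t])
    return out

w_k = combine([1, 2, 3], [1, 1, 1], alpha=1, d=1)
```

Because j^(d) only takes the values 1, j, −1, and −j, the scaling of w_(b) amounts to swapping and/or negating real and imaginary parts, so such a combination can be formed without a general complex multiplication.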

Forming the at least one of the transformed signals includes time filtering the input signal to form said transformed signal. The time filtering of the input signal includes applying a finite-impulse-response (FIR) filter to the input signal, or applying an infinite-impulse-response (IIR) filter to the input signal.

The transformed signals (w) include non-linear functions of the input signal (u).

The non-linear functions of the input signal (u) include at least one function of a form u[n−τ]|u[n−τ]|^(p) for a delay τ and an integer power p or Π_(j=1 . . . p) u[n−τ_(j)]Π_(j=p+1 . . . 2p−1) u[n−τ_(j)]* for a set of integer delays τ₁ to τ_(2p−1), where * indicates a complex conjugate operation.

Determining a plurality of phase-invariant derived signals (r) comprises determining real-valued derived signals.

Determining the phase-invariant derived signals (r) comprises processing the transformed signals (w) to produce a plurality of phase-invariant derived signals (r).

Each of the derived signals is equal to a function of one of the transformed signals.

Processing the transformed signals (w) to produce the phase-invariant derived signals includes, for at least one derived signal (r_(p)), computing said derived signal by first computing a phase-invariant non-linear function of one of the transformed signals (w_(k)) to produce a first derived signal, and then computing a linear combination of the first derived signal and delayed versions of the first derived signal to determine at least one derived signal.

Computing a phase-invariant non-linear function of one of the transformed signals (w_(k)) comprises computing a power of a magnitude of the one of the transformed signals (|w_(k)|^(p)) for an integer power p≥1. For example, p=1 or p=2.

Computing the linear combination of the first derived signal and delayed versions of the first derived signal comprises time filtering the first derived signal. Time filtering the first derived signal can include applying a finite-impulse-response (FIR) filter to the first derived signal or applying an infinite-impulse-response (IIR) filter to the first derived signal.

Processing the transformed signals (w) to produce the phase-invariant derived signals includes computing a first signal as a phase-invariant non-linear function of a first signal of the transformed signals, and computing a second signal as a phase-invariant non-linear function of a second of the transformed signals, and then computing a combination of the first signal and the second signal to form at least one of the phase-invariant derived signals.

At least one of the phase-invariant derived signals is equal to a function for two of the transformed signals w_(a) and w_(b) with a form |w_(a)[t]|^(α)|w_(b)[t−τ]|^(β) for positive integer powers α and β.

The transformed signals (w) are processed to produce the phase-invariant derived signals by computing a derived signal r_(k)[t] using at least one of the following transformations:

-   r_(k)[t]=|w_(a)[t]|^(α), where α>0 for a transformed signal w_(a)[t];
-   r_(k)[t]=0.5(1−θ+r_(a)[t−α]+θr_(b)[t]), where θ∈{1, −1}, a,b∈{1, . . . , k−1}, and α is an integer, and r_(a)[t] and r_(b)[t] are other of the derived signals;
-   r_(k)[t]=r_(a)[t−α]r_(b)[t], where a,b∈{1, . . . , k−1} and α is an integer and r_(a)[t] and r_(b)[t] are other of the derived signals; and
-   r_(k)[t]=r_(k)[t−1]+2^(−d)(r_(a)[t]−r_(k)[t−1]), where a∈{1, . . . , k−1} and d is an integer d>0.
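The four transformations listed above can be sketched as follows. The function names, the zero value assumed for samples before the start of a signal, and the zero initial state of the recursive average are assumptions made for illustration.

```python
# Sketches of the four derived-signal transformations; all outputs are real-valued.

def _delayed(r, t, alpha):
    """r[t - alpha], with out-of-range samples taken as zero (assumption)."""
    return r[t - alpha] if 0 <= t - alpha < len(r) else 0.0

def r_magnitude(w_a, alpha):
    """r_k[t] = |w_a[t]|**alpha, which is unchanged by a phase rotation of w_a."""
    return [abs(x) ** alpha for x in w_a]

def r_affine(r_a, r_b, alpha, theta):
    """r_k[t] = 0.5*(1 - theta + r_a[t-alpha] + theta*r_b[t]), theta in {1, -1}."""
    return [0.5 * (1 - theta + _delayed(r_a, t, alpha) + theta * r_b[t])
            for t in range(len(r_b))]

def r_product(r_a, r_b, alpha):
    """r_k[t] = r_a[t-alpha] * r_b[t]."""
    return [_delayed(r_a, t, alpha) * r_b[t] for t in range(len(r_b))]

def r_average(r_a, d):
    """r_k[t] = r_k[t-1] + 2**-d * (r_a[t] - r_k[t-1]), starting from zero (assumption)."""
    out, prev = [], 0.0
    for x in r_a:
        prev = prev + 2.0 ** (-d) * (x - prev)
        out.append(prev)
    return out
```

If r_(a) and r_(b) lie in [0,1], the affine form with θ=1 (their average) and with θ=−1 (one half of 2+r_(a)−r_(b)) also lies in [0,1]. The recursive average uses a power-of-two step, which in fixed-point arithmetic can be realized as a shift.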

The time-varying gain components comprise complex-valued gain components.

The method includes transforming a first derived signal (r_(j)) of the plurality of phase-invariant derived signals according to one or more different parametric non-linear transformations to produce corresponding time-varying gain components.

The one or more different parametric non-linear transformations comprises multiple different non-linear transformations producing corresponding time-varying gain components.

Each of the corresponding time-varying gain components forms a part of a different term of the plurality of terms of the sum forming the distortion term.

Forming the distortion term comprises forming a first sum of products, each term in the first sum being a product of a delayed version of the transformed signal and a second sum of a corresponding subset of the gain components.

${\delta \lbrack t\rbrack} = {\sum\limits_{k}{{w_{a_{k}}\left\lbrack {t - d_{k}} \right\rbrack}{\sum\limits_{i \in \Lambda_{k}}{g_{i}\lbrack t\rbrack}}}}$

The distortion term δ[t] has a form wherein for each term indexed by k, a_(k) selects the transformed signal, d_(k) determines the delay of said transformed signal, and Λ_(k) determines the subset of the gain components.
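The accumulation of the distortion term can be sketched as below. The function name `distortion_term`, the representation of each term as a tuple (a_(k), d_(k), Λ_(k)), and the skipping of samples that would reference time before the start of the signal are assumptions made for illustration.

```python
# Sketch of delta[t] = sum_k w[a_k][t - d_k] * sum_{i in Lambda_k} g[i][t].
# w is a list of complex signals, g a list of gain-component signals.

def distortion_term(w, g, terms):
    T = len(g[0])
    delta = [0j] * T
    for a_k, d_k, lam_k in terms:
        for t in range(T):
            if t - d_k < 0:
                continue  # no sample available before the start (assumption)
            gain = sum(g[i][t] for i in lam_k)  # second sum over the subset Lambda_k
            delta[t] += w[a_k][t - d_k] * gain
    return delta

w = [[1 + 0j, 2 + 0j], [1j, 1j]]
g = [[0.1, 0.1], [0.2, 0.2], [1j, 1j]]
terms = [(0, 0, [0, 1]), (1, 1, [2])]
delta = distortion_term(w, g, terms)
```

Per output sample, each term costs one complex multiplication plus the additions needed to sum its gain components, independent of how the gain components themselves were obtained.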

Transforming a first derived signal of the derived signals according to a parametric non-linear transformation comprises performing a table lookup in a data table corresponding to said transformation according to the first derived signal to determine a result of the transforming.

The parametric non-linear transformation comprises a plurality of segments, each segment corresponding to a different range of values of the first derived signal, and wherein transforming the first derived signal according to the parametric non-linear transformation comprises determining a segment of the parametric non-linear transformation from the first derived signal and accessing data from the data table corresponding to a said segment.

The parametric non-linear transformation comprises a piecewise linear or a piecewise constant transformation, and the data from the data table corresponding to the segment characterizes endpoints of said segment.

The non-linear transformation comprises a piecewise linear transformation, and transforming the first derived signal comprises interpolating a value on a linear segment of said transformation.
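A minimal sketch of a table lookup with linear interpolation follows. The uniform spacing of the segments, the normalization of the derived signal to the range [0,1], and the name `pwl_lookup` are assumptions made for illustration.

```python
# Piecewise linear lookup: table[i] is the value at breakpoint i / (len(table) - 1).
# Uniform breakpoints over [0, 1] are an assumption of this sketch.

def pwl_lookup(table, r):
    n_seg = len(table) - 1
    r = min(max(r, 0.0), 1.0)        # clamp the derived signal to the table range
    pos = r * n_seg
    seg = min(int(pos), n_seg - 1)   # segment index selected by the derived signal
    frac = pos - seg                 # position within the segment
    # Interpolate between the two endpoints that characterize the segment.
    return table[seg] + frac * (table[seg + 1] - table[seg])

table = [0, 1 + 1j, 0]
g_sample = pwl_lookup(table, 0.25)
```

With a power-of-two number of segments and a fixed-point derived signal, the segment index and the fractional position could be taken from the high and low bits of the signal value, respectively.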

The method further includes adapting configuration parameters of the parametric non-linear transformation according to sensed output of the non-linear circuit.

The method further includes acquiring a sensing signal (y) dependent on an output of the non-linear circuit, and wherein adapting the configuration parameters includes adjusting said parameters according to a relationship of the sensing signal (y) and at least one of the input signal (u) and the output signal (v).

Adjusting said parameters includes reducing a mean squared value of a signal computed from the sensing signal (y) and at least one of the input signal (u) and the output signal (v) according to said parameters.

Reducing the mean squared value includes applying a stochastic gradient procedure to incrementally update the configuration parameters.
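One common stochastic gradient procedure is a least-mean-squares (LMS) update, sketched below for a model that is linear in its parameters. This is an illustrative stand-in rather than the specific adaptation procedure of any embodiment: the names `lms_fit`, `basis`, and `target`, the step size `mu`, and the synthetic data are all assumptions.

```python
# LMS sketch: incrementally reduce the mean squared value of
# err[t] = target[t] - sum_i x[i] * basis[t][i] over complex parameters x.
import random

def lms_fit(basis, target, n_params, mu=0.1):
    x = [0j] * n_params
    for b, d in zip(basis, target):
        err = d - sum(xi * bi for xi, bi in zip(x, b))
        # Step each parameter along the negative gradient of |err|**2.
        x = [xi + mu * bi.conjugate() * err for xi, bi in zip(x, b)]
    return x

# Synthetic, noise-free data generated from known parameters (assumption).
rng = random.Random(0)
x_true = [0.5 + 0j, -0.2j]
basis = [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in x_true]
         for _ in range(2000)]
target = [sum(xi * bi for xi, bi in zip(x_true, b)) for b in basis]
x_hat = lms_fit(basis, target, len(x_true))
```

Each update uses only the current sample, so no history of the signals needs to be stored, at the cost of slower convergence than a batch solution over a time interval.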

Reducing the mean squared value includes processing a time interval of the sensing signal (y) and a corresponding time interval of at least one of the input signal (u) and the output signal (v).

The method includes performing a matrix inverse of a Gramian matrix determined from the time interval of the sensing signal and a corresponding time interval of at least one of the input signal (u) and the output signal (v).

The method includes forming the Gramian matrix as a time average Gramian.

The method includes performing a coordinate descent procedure based on the time interval of the sensing signal and a corresponding time interval of at least one of the input signal (u) and the output signal (v).

Transforming a first derived signal of the plurality of derived signals according to a parametric non-linear transformation comprises performing a table lookup in a data table corresponding to said transformation according to the first derived signal to determine a result of the transforming, and wherein adapting the configuration parameters comprises updating values in the data table.

The parametric non-linear transformation comprises a greater number of piecewise linear segments than adjustable parameters characterizing said transformation.

The non-linear transformation represents a function that is a sum of scaled kernels, a magnitude scaling each kernel being determined by a different one of the adjustable parameters characterizing said transformation.

Each kernel comprises a piecewise linear function.

Each kernel is zero for at least some range of values of the derived signal.
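A function formed as a sum of scaled kernels can be sketched as follows, using triangular ("hat") kernels as one example of a piecewise linear kernel that is zero over part of the range. The kernel shape, the uniform grid of kernel centers over [0,1], and the names `hat` and `phi` are assumptions made for illustration.

```python
# phi(r) = sum_i params[i] * kernel_i(r), with triangular kernels on a uniform grid.

def hat(r, center, width):
    """Piecewise linear kernel: 1 at its center, 0 at and beyond center +/- width."""
    return max(0.0, 1.0 - abs(r - center) / width)

def phi(r, params):
    """Evaluate the sum of scaled kernels; params are the adjustable parameters."""
    width = 1.0 / (len(params) - 1)
    return sum(x * hat(r, i * width, width) for i, x in enumerate(params))

value = phi(0.25, [1.0, 2.0, 3.0])
```

Because each kernel is non-zero only near its center, changing one parameter alters the function only locally, and the function is linear in the parameters for any fixed value of the derived signal.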

The parametric non-linear transformations are adapted according to measured characteristics of the non-linear circuit.

The transformed signals include a degree-1 combination of the separate band signals.

The transformed signals include a degree-2 or a degree-0 combination of the separate band signals.

Each derived signal (r_(j)) of the derived signals is equal to a non-linear function of a respective subset of one or more of the transformed signals, and at least some of the derived signals are equal to functions of different one or more of the transformed signals.

One or more of the derived signals (r_(j)) of the phase-invariant derived signals are transformed according to respective one or more parametric non-linear transformations (ϕ_(i, j)) to produce a time-varying gain component (g_(i)) of a plurality of gain components (g).

Each of the parametric non-linear transformations (Φ) is decomposable into a combination of one or more parametric functions (ϕ) of a corresponding single one of the derived signals (r_(j)).

The input signal (u) is filtered (e.g., time domain filtered) to form the plurality of separated band signals (u₁, . . . , u_(N) _(b) ). Alternatively, the separate band signals are directly provided as input rather than the overall input signal (u).

Each of the separated band signals is represented at a same sampling rate as the input signal.

The processing of the input signal (u) to produce a plurality of transformed signals (w) includes forming at least some of the transformed signals as combinations of subsets of the separate band signals or signals derived from said separate band signals.

The combinations of subsets of the separate band signals or signals derived from said separate band signals make use of delay, multiplication, and complex conjugate operations on the separate band signals.

Processing the input signal (u) to produce the plurality of transformed signals (w) includes scaling a magnitude of a separate band signal according to an overall power of the input signal (r₀).

Processing the input signal (u) to produce the plurality of transformed signals (w) includes raising a magnitude of a separate band signal to a first exponent (α) and rotating a phase of said band signal according to a second exponent (β) not equal to the first exponent.
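One reading of raising the magnitude to a first exponent and rotating the phase according to a second exponent is w=|u|^(α)e^(jβ∠u), sketched below for a single sample. This interpretation and the name `power_phase` are assumptions made for illustration.

```python
# Sketch of w = |u|**alpha * exp(1j * beta * angle(u)) for one complex sample.
import cmath

def power_phase(u_sample, alpha, beta):
    mag, ang = cmath.polar(u_sample)            # magnitude and phase in (-pi, pi]
    return cmath.rect(mag ** alpha, beta * ang)  # scale magnitude and phase separately

w_sample = power_phase(2j, alpha=2, beta=1)
```

For integer β the result does not depend on which branch of the phase is used, and equals u^(β)|u|^(α−β) for non-zero u; for non-integer β the result depends on the phase convention.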

Processing the input signal (u) to produce the plurality of transformed signals (w) includes forming at least one of the transformed signals as a multiplicative combination of one of the separate band signals (u_(a)) and a delayed version of another of the separate band signals (u_(b)).

Forming at least one of the transformed signals as a linear combination includes forming a linear combination with at least one imaginary or complex multiple input signal or a delayed version of the input signal.

At least one of the transformed signals, w_(k), is formed to be a multiple of D_(α)w_(a)+j^(d)w_(b), where w_(a) and w_(b) are other of the transformed signals each of which depend on only a single one of the separate band signals, and D_(α) represents a delay by α, and d is an integer between 0 and 3.

In another aspect, in general, a digital predistorter circuit is configured to perform all the steps of any of the methods set forth above.

In another aspect, in general, a design structure is encoded on a non-transitory machine-readable medium. The design structure comprises elements that, when processed in a computer-aided design system, generate a machine-executable representation of the digital predistortion circuit that is configured to perform all the steps of any of the methods set forth above.

In another aspect, in general, a non-transitory computer readable medium is programmed with a set of computer instructions executable on a processor. When these instructions are executed, they cause operations including all the steps of any of the methods set forth above.

DESCRIPTION OF DRAWINGS

FIG. 1 is a block diagram of a radio transmitter.

FIG. 2 is a block diagram of the pre-distorter of FIG. 1.

FIG. 3 is a block diagram of a distortion signal combiner of FIG. 2.

FIGS. 4A-E are graphs of example gain functions.

FIG. 5 is a diagram of a table-lookup implementation of a gain lookup section of FIG. 2.

FIGS. 6A-B are diagrams of a section of a table lookup for piecewise linear functions.

FIG. 7A is a frequency plot of a two-band example with high-order intermodulation distortion terms.

FIG. 7B is a frequency plot of an input signal corresponding to FIG. 7A.

FIG. 7C is a frequency plot of a distortion signal corresponding to FIG. 7B.

FIG. 8 is a plot of a sampled carrier signal.

DESCRIPTION

Referring to FIG. 1, in an exemplary structure of a radio transmitter 100, a desired baseband input signal u[.] passes to a baseband section 110, producing a predistorted signal v[.]. In the description below, unless otherwise indicated, signals such as u[.] and v[.] are described as complex-valued signals, with the real and imaginary parts of the signals representing the in-phase and quadrature terms (i.e., the quadrature components) of the signal. The predistorted signal v[.] then passes through a radio frequency (RF) section 140 to produce an RF signal p(.), which then drives a transmit antenna 150. In this example, the output signal is monitored (e.g., continuously or from time to time) via a coupler 152, which drives an adaptation section 160. The adaptation section also receives the input to the RF section, v[.]. The adaptation section 160 determines values of parameters x, which are passed to the baseband section 110, and which affect the transformation from u[.] to v[.] implemented by that section.

The structure of the radio transmitter 100 shown in FIG. 1 includes an optional envelope tracking aspect, which is used to control the power (e.g., the voltage) supplied to a power amplifier of the RF section 140, such that less power is provided when the input u[.] has smaller magnitude over a short term and more power is provided when it has larger magnitude. When such an aspect is included, an envelope signal e[.] is provided from the baseband section 110 to the RF section 140, and may also be provided to the adaptation section 160.

The baseband section 110 has a predistorter 130, which implements the transformation from the baseband input u[.] to the input v[.] to the RF section 140. This predistorter is configured with the values of the configuration parameters x provided by the adaptation section 160 if such adaptation is provided. Alternatively, the parameter values are set when the transmitter is initially tested, or may be selected based on operating conditions, for example, as generally described in U.S. Pat. No. 9,590,668, “Digital Compensator.”

In examples that include an envelope-tracking aspect, the baseband section 110 includes an envelope tracker 120, which generates the envelope signal e[.]. For example, this signal tracks the magnitude of the input baseband signal, possibly filtered in the time domain to smooth the envelope. In particular, the values of the envelope signal may be in the range [0,1], representing the fraction of a full range. In some examples, there are N_(E) such components of the signal (i.e., e[.]=(e₁[.], . . . , e_(N) _(E) [.])); for example, e₁[.] may be a conventional envelope signal, and the other components may be other signals, such as environmental measurements, clock measurements (e.g., the time since the last “on” switch, such as a ramp signal synchronized with time-division-multiplex (TDM) intervals), or other user monitoring signals. This envelope signal is optionally provided to the predistorter 130. Because the envelope signal may be provided to the RF section, thereby controlling power provided to a power amplifier, and because the power provided may change the non-linear characteristics of the RF section, in at least some examples, the transformation implemented by the predistorter depends on the envelope signal.

Turning to the RF section 140, the predistorted baseband signal v[.] passes through an RF signal generator 142, which modulates the signal to the target radio frequency band at a center frequency f_(c). This radio frequency signal passes through a power amplifier (PA) 148 to produce the antenna driving signal p(.). In the illustrated example, the power amplifier is powered at a supply voltage determined by an envelope conditioner 122, which receives the envelope signal e[.] and outputs a time-varying supply voltage V_(c) to the power amplifier.

As introduced above, the predistorter 130 is configured with a set of fixed parameters z, and values of a set of adaptation parameters x, which in the illustrated embodiment are determined by the adaptation section 160. Very generally, the fixed parameters determine the family of compensation functions that may be implemented by the predistorter, and the adaptation parameters determine the particular function that is used. The adaptation section 160 receives a sensing of the signal passing between the power amplifier 148 and the antenna 150, for example, with a signal sensor 152 preferably near the antenna (i.e., after the RF signal path between the power amplifier and the antenna, in order to capture non-linear characteristics of the passive signal path). RF sensor circuitry 164 demodulates the sensed signal to produce a representation of the signal band y[.], which is passed to an adapter 162. The adapter 162 essentially uses the inputs to the RF section, namely v[.] and/or the input to the predistorter u[.] (e.g., according to the adaptation approach implemented) and optionally e[.], and the representation of sensed output of the RF section, namely y[.]. In the analysis below, the RF section is treated as implementing a generally non-linear transformation represented as y[.]=F(v[.], e[.]) in the baseband domain, with a sampling rate sufficiently large to capture not only the bandwidth of the original signal u[.] but also a somewhat extended bandwidth to include significant non-linear components that may have frequencies outside the desired transmission band. In later discussions below, the sampling rate of the discrete time signals in the baseband section 110 is denoted as f_(s).

The adapter 162 is illustrated in FIG. 1 and described below as essentially receiving u[t] and/or v[t] synchronized with y[t]. However, there is a delay in the signal path from the input to the RF section 140 to the output of the RF sensor 164. Therefore, a synchronization section (not illustrated) may be used to account for the delay, and optionally to adapt to changes in the delay. For example, the signals are upsampled and correlated, thereby yielding a fractional sample delay compensation, which may be applied to one or the other signal before processing in the adaptation section. Another example of a synchronizer is described in U.S. Pat. No. 10,141,961, which is incorporated herein by reference.
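The correlation step of such a synchronization can be sketched as below for whole-sample delays only. The upsampling that would yield a fractional delay is omitted, and the name `estimate_delay`, the search range, and the test signal are assumptions made for illustration.

```python
# Estimate the whole-sample delay of y relative to v by maximizing the
# magnitude of their cross-correlation over candidate lags 0..max_lag.

def estimate_delay(v, y, max_lag):
    best_lag, best_score = 0, -1.0
    for lag in range(max_lag + 1):
        n = min(len(v), len(y) - lag)  # overlap available at this lag
        score = abs(sum(y[t + lag] * v[t].conjugate() for t in range(n)))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Hypothetical test signal: y is a scaled copy of v delayed by three samples.
v = [1 + 0j, -1 + 1j, 0.5j, -2 + 0j, 1 - 1j, 0.3 + 0.7j, -0.4j, 1.5 + 0j]
y = [0j] * 3 + [0.9 * s for s in v]
lag = estimate_delay(v, y, max_lag=5)
```

Using the magnitude of the correlation makes the estimate insensitive to an unknown constant gain or phase rotation between the two signals.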

Although various structures for the transformation implemented by the predistorter 130 may be used, in one or more embodiments described below, the functional form implemented is

v[.]=u[.]+δ[.]

where

δ[.]=Δ(u[.], e[.]),

and Δ(,), which may be referred to as the distortion term, is effectively parameterized by the parameters x. Rather than using a set of terms as outlined above for the Volterra or delay polynomial approaches, the present approach makes use of a multiple stage approach in which a diverse set of targeted distortion terms are combined in a manner that satisfies the requirements of low computation requirement, low storage requirement, and robustness, while achieving a high degree of linearization.

Very generally, the structure of the function Δ(,) is motivated by application of the Kolmogorov Superposition Theorem (KST). One statement of KST is that a non-linear function of d arguments x₁, . . . , x_(d)∈[0,1]^(d) may be expressed as

$\sum\limits_{i = 1}^{{2d} + 1}{g_{i}\left( {\sum\limits_{j = 1}^{d}{h_{ij}\left( x_{j} \right)}} \right)}$

for some functions g_(i) and h_(ij). Proofs of the existence of such functions may concentrate on particular types of non-linear functions, for example, fixing the h_(ij) and proving the existence of suitable g_(i). In application to approaches described in this document, this motivation yields a class of non-linear functions defined by constituent non-linear functions somewhat analogous to the g_(i) and/or the h_(ij) in the KST formulation above.

Referring to FIG. 2, the predistorter 130 performs a series of transformations that generate a diverse set of building blocks for forming the distortion term using an efficient table-driven combination. As a first transformation, the predistorter includes a complex transformation component 210, labelled L_(C) and also referred to as the “complex layer.”

Generally, the complex layer receives the input signal, and outputs multiple transformed signals. In the present embodiment, the input to the complex transformation component is the complex input baseband signal, u[.], and the output is a set of complex baseband signals, w[.], which may be represented as a vector of signals and indexed w₁[.], w₂[.], . . . , w_(N) _(W) [.], where N_(W) is the number of such signals. Very generally, these complex baseband signals form terms for constructing the distortion term. More specifically, the distortion term is constructed as a weighted summation of the set of baseband signals, where the weighting is time varying, and determined based on both the inputs to the predistorter 130, u[.] and e[.], as well as the values of the configuration parameters, x. Going forward, the denotation of signals with “[.]” is omitted, and the context should make evident when the signal as a whole is referenced versus a particular sample.

Note that as illustrated in FIG. 2, the complex layer 210 is configured with values of fixed parameters z, but does not depend on the adaptation parameters x. For example, the fixed parameters are chosen according to the type of RF section 140 being linearized, and the fixed parameters determine the number N_(W) of the complex signals generated, and their definition.

In one implementation, the set of complex baseband signals includes the input itself, w₁=u, as well as various delays of that signal, for example, w_(k)=u[t−k+1] for k=1, . . . , N_(W). In another implementation, the complex signals output from the complex layer are arithmetic functions of the input, for example

(u[t]+u[t−1])/2;

(u[t]+ju[t−1])/2; and

((u[t]+ju[t−1])/2+u[t−2])/2.

In at least some examples, these arithmetic functions are selected to limit the needed computational resources by having primarily additive operations and multiplicative operations by constants that may be implemented efficiently (e.g., division by 2). In another implementation, a set of relatively short finite-impulse-response (FIR) filters modify the input u[t] to yield w_(k)[t], where the coefficients may be selected according to time constants and resonance frequencies of the RF section.

In yet another implementation, the set of complex baseband signals includes the input itself, w₁=u, as well as various combinations, for example, of the form

w _(k)=0.5(D _(α) w _(a) +j ^(d) w _(b)),

where D_(α) represents a delay of a signal by an integer number α of samples, and d is an integer, generally with d∈{0,1,2,3}, that may depend on k, and k>a,b (i.e., each signal w_(k) may be defined in terms of previously defined signals), such that

w _(k)[t]=0.5(w _(a)[t−α]+j ^(d) w _(b)[t]).
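As a non-limiting sketch of this recursive construction, the following builds the signals w_(k) from a recipe of (a, b, α, d) tuples. The names `build_complex_layer`, `delay`, and `J_POWERS`, the list-based signal representation, and the zero-filling of samples before the start of the signal are assumptions made for the example.

```python
# Multiplication by j**d only swaps and negates real and imaginary parts,
# so the four possible factors are kept in a small table.
J_POWERS = (1 + 0j, 1j, -1 + 0j, -1j)


def delay(sig, n):
    """Delay a signal by n >= 0 samples, zero-filling the first n samples."""
    n = min(n, len(sig))
    return [0j] * n + list(sig[:len(sig) - n])


def build_complex_layer(u, recipe):
    """Return [w_1, w_2, ...] with w_1 = u and each further signal defined as
    w_k[t] = 0.5 * (w_a[t - alpha] + j**d * w_b[t]).

    recipe: list of (a, b, alpha, d) tuples; a and b are 1-based indices
    that refer only to previously defined signals.
    """
    w = [list(u)]
    for a, b, alpha, d in recipe:
        wa = delay(w[a - 1], alpha)
        wb = w[b - 1]
        jd = J_POWERS[d % 4]
        w.append([0.5 * (x + jd * y) for x, y in zip(wa, wb)])
    return w


# Hypothetical usage: w_2 = (u[t-1] + u[t]) / 2 and w_3 = (u[t-1] + j u[t]) / 2.
u = [1 + 0j, 2 + 0j, 3 + 0j, 4 + 0j]
w = build_complex_layer(u, [(1, 1, 1, 0), (1, 1, 1, 1)])
```

Only additions, a halving, and sign or component swaps are needed per sample, which is in keeping with the stated aim of limiting multiplications.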

There are various ways of choosing which combinations of signals (e.g., the a,b,d values) determine the signals constructed. One way is essentially by trial and error, for example, adding signals from a set of values in a predetermined range that most improve performance in a greedy manner (e.g., by a directed search) one by one.

Continuing to refer to FIG. 2, a second stage is a real transformation component 220, labelled L_(R) and also referred to as the “real layer.” The real transformation component receives the N_(W) signals w, optionally as well as the envelope signal e, and outputs N_(R) (generally greater than N_(W)) real signals r, in a bounded range, in this implementation in a range [0,1]. In some implementations, the real signals are scaled, for example, based on a fixed scale factor that is based on the expected level of the input signal u. In some implementations, the fixed parameters for the system may include a scale (and optionally an offset) in order to achieve a typical range of [0,1]. In yet other implementations, the scale factors may be adapted to maintain the real values in the desired range.

In one implementation, each of the complex signals w_(k) passes to one or more corresponding non-linear functions f(w), which accepts a complex value and outputs a real value r that does not depend on the phase of its input (i.e., the function is phase-invariant). Examples of these non-linear functions, with an input w=w_(re)+jw_(im), include the following:

|w|=|w _(re) +jw _(im)|=(w _(re) ² +w _(im) ²)^(1/2);

ww*=|w| ²;

log(a+ww*); and

|w|^(1/2).

In at least some examples, the non-linear function is monotone or non-decreasing in norm (e.g., an increase in |w| corresponds to an increase in r=f(w)).
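The phase-invariance property of the listed functions can be checked numerically with a short sketch such as the following. The function name `real_features`, the dictionary keys, and the default value a=1.0 for the constant in log(a+ww*) are assumptions chosen for illustration.

```python
import cmath
import math


def real_features(w, a=1.0):
    """Evaluate the example phase-invariant functions for one complex sample."""
    m2 = (w * w.conjugate()).real          # ww* = |w|^2
    return {
        "abs": math.sqrt(m2),              # |w|
        "abs2": m2,                        # |w|^2
        "log": math.log(a + m2),           # log(a + ww*)
        "sqrt_abs": m2 ** 0.25,            # |w|^(1/2)
    }


# Rotating the input by an arbitrary phase should leave every output unchanged.
sample = 0.3 + 0.4j
plain = real_features(sample)
rotated = real_features(sample * cmath.exp(1.234j))
max_diff = max(abs(plain[k] - rotated[k]) for k in plain)
```

Each function depends on w only through ww*, which is why a rotation of the input has no effect beyond rounding error.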

In some implementations, the output of a non-linear, phase-invariant function may be filtered, for example, with real linear time-invariant filters. In some examples, each of these filters is an Infinite Impulse-Response (IIR) filter implemented as having a rational polynomial Laplace or Z Transform (i.e., characterized by the locations of the poles and zeros of the Transform of the transfer function). An example of a Z transform for an IIR filter is:

$\frac{Y(z)}{X(z)} = \frac{z - q}{z^{2} - {2qz} + p}$

where, for example, p=0.7105 and q=0.8018. In other examples, a Finite Impulse-Response (FIR) filter is used. An example of a FIR filter with input x and output y is:

${{y\left\lbrack {n + 1} \right\rbrack} = {\sum\limits_{\tau}{\left( {1 - 2^{- k}} \right)^{\tau}{x\left\lbrack {n - \tau} \right\rbrack}}}},$

for example with k=1 or k=4.

In yet another implementation, the particular signals are chosen (e.g., by trial and error, in a directed search, iterative optimization, etc.) from one or more of the following families of signals:

-   a. r_(k)=e_(k) for k=1, . . . , N_(E), where e₁, . . . , e_(N) _(E) are the optional components of signal e;
-   b. r_(k)[t]=|w_(a)[t]|^(α) for all t, where α>0 (with α=1 or α=2 being most common) and a∈{1, . . . , N_(W)} may depend on k;
-   c. r_(k)[t]=0.5(1−θ+r_(a)[t−α]+θr_(b)[t]) for all t, where θ∈{1, −1}, a,b∈{1, . . . , k−1}, and α is an integer that may depend on k;
-   d. r_(k)[t]=r_(a)[t−α]r_(b)[t] for all t, where a,b∈{1, . . . , k−1} and α is an integer that may depend on k;
-   e. r_(k)[t]=r_(k)[t−1]+2^(−d)(r_(a)[t]−r_(k)[t−1]) for all t, where a∈{1, . . . , k−1} and integer d, d>0, may depend on k (equivalently, r_(k) is the response of a first order linear time invariant (LTI) filter with a pole at 1−2^(−d), applied to r_(a) for some a<k);
-   f. r_(k) is the response (appropriately scaled and centered) of a second order LTI filter with complex poles (carefully selected for easy implementability), applied to r_(a) for some a∈{1, . . . , k−1}.

As illustrated in FIG. 2, the real layer 220 is configured by the fixed parameters z, which determine the number of real signals N_(R), and their definition. However, as with the complex layer 210, the real layer does not depend on the adaptation parameters x. The choice of real functions may depend on characteristics of the RF section 140 in a general sense, for example, being selected based on manufacturing or design-time considerations, but these functions do not generally change during operation of the system while the adaptation parameters x may be updated on an ongoing basis in at least some implementations.

According to construction (a), the components of e are automatically treated as real signals (i.e., the components of r). Construction (b) presents a convenient way of converting complex signals to real ones while assuring that scaling the input u by a complex constant with unit absolute value does not change the outcome (i.e., phase-invariance). Constructions (c) and (d) allow addition, subtraction, and (if needed) multiplication of real signals. Construction (e) allows averaging (i.e., cheaply implemented low-pass filtering) of real signals and construction (f) offers more advanced spectral shaping, which is needed for some real-world power amplifiers 148, which may exhibit a second order resonance behavior. Note that more generally, the transformations producing the r components are phase invariant in the original baseband input u, that is, multiplication of u[t] by exp(jθ) or exp(jωt) does not change r_(p)[t].
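For illustration, the averaging of construction (e) may be sketched as below. The function name `first_order_lowpass` and the initial state `init` are assumptions; in fixed-point hardware the factor 2^(−d) would typically be a right shift by d bits, which this floating-point sketch only mimics.

```python
def first_order_lowpass(r_a, d, init=0.0):
    """Apply r_k[t] = r_k[t-1] + 2**(-d) * (r_a[t] - r_k[t-1]).

    This is a first order LTI filter with a single pole at 1 - 2**(-d).
    """
    out = []
    prev = init                      # assumed state before the first sample
    step = 2.0 ** (-d)
    for x in r_a:
        prev = prev + step * (x - prev)
        out.append(prev)
    return out


# Hypothetical usage: step response with d=1 and impulse response with d=2.
step_response = first_order_lowpass([1.0, 1.0, 1.0], d=1)
impulse_response = first_order_lowpass([1.0, 0.0, 0.0], d=2)
```

If the input stays within [0,1] and the initial state is in [0,1], the output also stays within [0,1], since each update is a convex combination of the previous output and the new input.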

Constructing the signals w and r can provide a diversity of signals from which the distortion term may be formed using a parameterized transformation. In some implementations, the form of the transformation is as follows:

${{\delta \lbrack t\rbrack} = {\sum\limits_{k}{{w_{a_{k}}\left\lbrack {t - d_{k}} \right\rbrack}{\Phi_{k}^{(x)}\left( {r\lbrack t\rbrack} \right)}}}}.$

The function Φ_(k) ^((x))(r) takes as an argument the N_(R) components of r, and maps those values to a complex number according to the values of the parameters x. That is, each function Φ_(k) ^((x))(r) essentially provides a time-varying complex gain for the k^(th) term in the summation forming the distortion term. With up to D delays (i.e., 0≤d_(k)<D) and N_(W) different w[t] functions, there are up to N_(W)D terms in the sum. The selection of the particular terms (i.e., the values of a_(k) and d_(k)) is represented in the fixed parameters z that configure the system.

Rather than configuring functions of N_(R) arguments, some embodiments structure the Φ_(k) ^((x))(r) functions as a summation of functions of single arguments as follows:

${\Phi_{k}^{(x)}\left( {r\lbrack t\rbrack} \right)} = {\sum\limits_{j}{\varphi_{k,j}\left( {r_{j}\lbrack t\rbrack} \right)}}$

where the summation over j may include all N_(R) terms, or may omit certain terms. Overall, the distortion term is therefore computed to result in the following:

${\delta \lbrack t\rbrack} = {\sum\limits_{k}{{w_{a_{k}}\left\lbrack {t - d_{k}} \right\rbrack}{\sum\limits_{j}{{\varphi_{k,j}\left( {r_{j}\lbrack t\rbrack} \right)}.}}}}$

Again, the summation over j may omit certain terms, for example, as chosen by the designer according to their know-how and other experience or experimental measurements. This transformation is implemented by the combination stage 230, labelled L_(R) in FIG. 2. Each term in the sum over k uses a different combination of a selection of a component a_(k) of w and a delay d_(k) for that component. The sum over j yields a complex multiplier for that combination, essentially functioning as a time-varying gain for that combination.
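A minimal sketch of evaluating this double summation for a single time index follows. The data layout (lists of lists for w and r, and a list of `(a_k, d_k, phis)` tuples with 0-based indices), the function name `distortion_term`, and the treatment of samples before the start of the signal as zero are all assumptions made for the example.

```python
def distortion_term(w, r, terms, t):
    """Compute delta[t] = sum_k w_{a_k}[t - d_k] * sum_j phi_{k,j}(r_j[t]).

    w:     list of complex signals, w[a][t]
    r:     list of real signals in [0, 1], r[j][t]
    terms: list of (a_k, d_k, phis), where phis maps an index j to a
           callable taking a real value and returning a complex gain
    """
    delta = 0j
    for a_k, d_k, phis in terms:
        if t - d_k < 0:
            continue                 # assumed zero before the signal starts
        # Time-varying complex gain for this term.
        gain = sum(phi(r[j][t]) for j, phi in phis.items())
        delta += w[a_k][t - d_k] * gain
    return delta


# Hypothetical usage with one complex signal, one real signal, two terms.
w = [[1 + 0j, 2 + 0j]]
r = [[0.25, 0.5]]
terms = [
    (0, 0, {0: lambda x: 2.0 * x}),     # undelayed term
    (0, 1, {0: lambda x: 1j * x}),      # term delayed by one sample
]
delta_1 = distortion_term(w, r, terms, t=1)
```

In this sketch the callables stand in for the parameterized functions; the summation over j simply skips any r_(j) for which no function is supplied.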

As an example of one term in the summation that yields the distortion term, consider w₁=u, and r=|u|² (i.e., applying transformation (b) with a=1, and α=2), which together yield a term of the form uϕ(|u|²) where ϕ( ) is one of the parameterized scalar functions. Note the contrast of such a term as compared to a simple scalar weighting of a term u|u|², which lacks the larger number of degrees of freedom obtainable through the parameterization of ϕ( ).

Each function ϕ_(k,j)(r_(j)) implements a parameterized mapping from the real argument r_(j), which is in the range [0,1], to a complex number, optionally limited to complex numbers with magnitudes less than or equal to one. These functions are essentially parameterized by the parameters x, which are determined by the adaptation section 160 (see FIG. 1). In principle, if there are N_(W) components of w, and delays from 0 to D−1 are permitted, and each component of the N_(R) components of r may be used, then there may be up to a total of N_(W)·D·N_(R) different functions ϕ_(k, j)( ).

In practice, a subset of these terms is used, being selected for instance by trial-and-error or greedy selection. In an example of a greedy iterative selection procedure, a number of possible terms (e.g., w and r combinations) are evaluated according to their usefulness in reducing a measure of distortion (e.g., peak or average RMS error, impact on EVM, etc. on a sample data set) at an iteration and one or possibly more best terms are retained before proceeding to the next iteration where further terms may be selected, with a stopping rule, such as a maximum number of terms or a threshold on the reduction of the distortion measure. A result is that for any term k in the sum, only a subset of the N_(R) components of r are generally used. For a highly nonlinear device, a design generally works better employing a variety of r_(k) signals. For nonlinear systems with strong memory effect (i.e., poor harmonic frequency response), the design tends to require more shifts in the w_(k) signals. In an alternative selection approach, the best choices of w_(k) and r_(k) with given constraints starts with a universal compensator model which has a rich selection of w_(k) and r_(k), and then an L1 trimming is used to restrict the terms.
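The greedy procedure described above may be sketched generically as follows. The function name `greedy_select`, its parameters, and the toy scoring function are assumptions; in a real design the `score` callable would fit the predistorter with the candidate terms and return the chosen distortion measure on a sample data set.

```python
def greedy_select(candidates, score, max_terms, min_gain):
    """Add one candidate term per iteration, keeping the term that lowers
    score() the most.

    Stops when max_terms is reached or when the best available term
    improves the score by less than min_gain.
    """
    chosen = []
    best = score(chosen)
    while len(chosen) < max_terms:
        trials = [(score(chosen + [c]), c) for c in candidates if c not in chosen]
        if not trials:
            break
        new_best, pick = min(trials, key=lambda pair: pair[0])
        if best - new_best < min_gain:
            break                     # stopping rule: improvement too small
        chosen.append(pick)
        best = new_best
    return chosen, best


# Toy stand-in for a distortion measure: distance of the sum from a target.
def toy_score(selected):
    return abs(10 - sum(selected))


picked, residual = greedy_select([1, 5, 7, 2], toy_score, max_terms=4, min_gain=0.5)
```

The sketch retains a single best term per iteration; retaining several per iteration, as the text permits, would only change the selection line.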

Referring to FIG. 4A, one functional form for the ϕ_(k, j)(r_(j)) functions, generically referred to as ϕ(r), is as a piecewise constant function 410. In FIG. 4A, the real part of such a piecewise constant function is shown in which the interval from 0.0 to 1.0 is divided into 8 sections (i.e., 2^(S) sections for S=3). In embodiments that use such a form, the adaptive parameters x directly represent the values of these piecewise constant sections 411, 412-418. In FIG. 4A, and in examples below, the r axis is divided in regular intervals, in the figure in equal width intervals. The approaches described herein do not necessarily depend on uniform intervals, and the axis may be divided in unequal intervals, with all functions using the same set of intervals or different functions potentially using different intervals. In some implementations, the intervals are determined by the fixed parameters z of the system.

Referring to FIG. 4B, another form of function is a piecewise linear function 420. Each section 431-438 is linear and is defined by the values of its endpoints. Therefore, the function 420 is defined by the 9 (i.e., 2^(S)+1) endpoints. The function 420 can also be considered to be the weighted sum of predefined kernels b_(l)(r) for l=0, . . . , L, in this illustrated case with L=2^(S)=8 sections and therefore L+1=9 kernels. In particular, these kernels may be defined as:

$b_{0}(r) = \begin{cases} 1 - rL & \text{for } 0 \leq r \leq 1/L \\ 0 & \text{otherwise,} \end{cases}$

$b_{i}(r) = \begin{cases} 1 + (r - i/L)L & \text{for } (i - 1)/L \leq r \leq i/L \\ 1 - (r - i/L)L & \text{for } i/L \leq r \leq (i + 1)/L \\ 0 & \text{otherwise} \end{cases} \quad \text{for } 0 < i < L, \text{ and}$

$b_{L}(r) = \begin{cases} 1 + (r - 1)L & \text{for } (L - 1)/L \leq r \leq 1 \\ 0 & \text{otherwise.} \end{cases}$

The function 420 is then effectively defined by the weighted sum of these kernels as:

${f(r)} = {\sum\limits_{l = 0}^{L}{x_{l}{b_{l}(r)}}}$

where the x_(l) are the values at the endpoints of the linear segments.
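As an illustrative check that a weighted sum of triangular kernels reproduces linear interpolation between endpoint values, consider the sketch below. It assumes L equal-width sections on [0,1] with L+1 kernels centered at the points l/L; the function names `hat_kernel` and `piecewise_linear` are hypothetical.

```python
def hat_kernel(l, L, r):
    """Triangular kernel that equals 1 at r = l/L and falls linearly to 0
    at the neighbouring points (l-1)/L and (l+1)/L."""
    return max(0.0, 1.0 - abs(r * L - l))


def piecewise_linear(x, r):
    """Weighted sum of hat kernels; x holds the L+1 endpoint values."""
    L = len(x) - 1
    return sum(x[l] * hat_kernel(l, L, r) for l in range(L + 1))


# Hypothetical endpoint values for L = 4 sections (real here for brevity;
# complex values work the same way).
x = [0.0, 1.0, 4.0, 9.0, 16.0]
at_knot = piecewise_linear(x, 0.5)        # exactly on the endpoint l = 2
mid_section = piecewise_linear(x, 0.375)  # halfway between l = 1 and l = 2
```

At any r, at most two kernels are non-zero and their values sum to one, so the result is the linear interpolation of the two neighbouring endpoint values.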

Referring to FIG. 4C, different kernels may be used. For example, a smooth function 440 may be defined as the summation of weighted kernels 441, 442-449. In some examples, the kernels are non-zero over a restricted range of values of r, for example, with b_(l)(r) being zero for r outside [(l−n)/L, (l+n)/L] for n=1, or some larger value of n<L.

Referring to FIG. 4D, in some examples, a piecewise linear function forms an approximation of a smooth function. In the example shown in FIG. 4D, a smooth function, such as the function in FIG. 4C, is defined by 9 values, the multipliers for kernel functions b₀ through b₈. This smooth function is then approximated by a larger number of linear sections 451-466, in this case 16 sections defined by 17 endpoints 470, 471-486. As is discussed below, this results in there being 9 (complex) parameters to estimate, which are then transformed to 17 parameters for configuring the predistorter. Of course, different numbers of estimated parameters and linear sections may be used. For example, 4 smooth kernels may be used in estimation and then 32 linear sections may be used in the runtime predistorter.

Referring to FIG. 4E, in another example, the kernel functions themselves are piecewise linear. In this example, 9 kernel functions, of which two 491 and 492 are illustrated, are used. Because the kernels have linear segments of length 1/16, the summation of the 9 kernel functions results in a function 490 that has 16 linear segments. One way to form the kernel functions is a 1/M^(th) band interpolation filter, in this illustration a half-band filter. In another example that is not illustrated, 5 kernels can be used to generate the 16-segment function essentially by using quarter-band interpolation filters. The specific form of the kernels may be determined by other approaches, for example, to optimize smoothness or frequency content of the resulting functions, for example, using linear programming of finite-impulse-response filter design techniques.

It should also be understood that the approximations shown in FIGS. 4D-E do not have to be linear. For example, a low-order spline may be used to approximate the smooth function, with fixed knot locations (e.g., equally spaced along the r axis), or with knots located with unequal spacing and/or at locations determined during the adaptation process, for example, to optimize a degree of fit of the splines to the smooth function.

Referring to FIG. 3, the combination stage 230 is implemented in two parts: a lookup table stage 330, and a modulation stage 340. The lookup table stage 330, labelled L_(T), implements a mapping from the N_(R) components of r to N_(G) components of a complex vector g. Each component g_(i) corresponds to a unique function ϕ_(k, j) used in the summation shown above. The components of g corresponding to a particular term k have indices i in a set denoted Λ_(k). Therefore, the combination sum may be written as follows:

${{\delta \lbrack t\rbrack} = {\sum\limits_{k}{{w_{a_{k}}\left\lbrack {t - d_{k}} \right\rbrack}{\sum\limits_{i \in \Lambda_{k}}{g_{i}\lbrack t\rbrack}}}}}.$

This summation is implemented in the modulation stage 340 shown in FIG. 3. As introduced above, the values of the a_(k), d_(k), and Λ_(k) are encoded in the fixed parameters z.

Note that the parameterization of the predistorter 130 (see FIG. 1) is focused on the specification of the functions ϕ_(k, j)( ). In a preferred embodiment, these functions are implemented in the lookup table stage 330. The other parts of the predistorter, including the selection of the particular components of w that are formed in the complex transformation component 210, the particular components of r that are formed in the real transformation component 220, and the selection of the particular functions ϕ_(k, j)( ) that are combined in the combination stage 230, are fixed and do not depend on the values of the adaptation parameters x. Therefore, in at least some embodiments, these fixed parts may be implemented in fixed dedicated circuitry (i.e., “hardwired”), with only the parameters of the functions being adapted by writing to storage locations of those parameters.

One efficient approach to implementing the lookup table stage 330 is to restrict each of the functions ϕ_(k, j)( ) to have a piecewise constant or piecewise linear form. Because the argument to each of these functions is one of the components of r, and the argument range is therefore restricted to [0,1], the range can be divided into 2^(s) sections, for example, 2^(s) equal sized sections with boundaries at i2^(−s) for i∈{0,1, . . . , 2^(s)}. In the case of a piecewise constant function, the function can be represented in a table with 2^(s) complex values, such that evaluating the function for a particular value of r_(j) involves retrieving one of the values. In the case of piecewise linear functions, a table with 1+2^(s) values can represent the function, such that evaluating the function for a particular value of r_(j) involves retrieving two values from the table for the boundaries of the section that r_(j) is within, and appropriately linearly interpolating the retrieved values.
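The two table forms can be sketched as follows, assuming equal-width sections and a floating-point argument in [0,1]. The function names `lut_constant` and `lut_linear` and the clamping of r=1.0 into the last section are assumptions made for the example.

```python
def lut_constant(table, r):
    """Piecewise constant lookup: table holds 2**s values, one per section."""
    n = len(table)
    i = min(int(r * n), n - 1)       # clamp r = 1.0 into the last section
    return table[i]


def lut_linear(table, r):
    """Piecewise linear lookup: table holds 2**s + 1 endpoint values."""
    n = len(table) - 1               # number of sections
    i = min(int(r * n), n - 1)
    frac = r * n - i                 # position within the section, 0..1
    return table[i] + frac * (table[i + 1] - table[i])


# Hypothetical tables for s = 2 (4 sections).
const_table = [1j, 2 + 0j, 3 + 0j, 4 + 0j]
linear_table = [0j, 1 + 1j, 2 + 0j, 3 + 0j, 4 + 0j]
c = lut_constant(const_table, 0.3)
p = lut_linear(linear_table, 0.125)
```

The piecewise constant form costs one table read per evaluation, while the piecewise linear form costs two reads, one subtraction, one multiplication, and one addition.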

Referring to FIG. 5, one implementation of the lookup table stage 330, in this illustration for piecewise constant functions, makes use of a set of tables (or parts of one table) 510-512. Table 510 has one row for each function ϕ_(k,1)(r₁), table 511 has one row for each function ϕ_(k,2)(r₂), and so forth. That is, each row represents the endpoints of the linear segments of the piecewise linear form of the function. In such an arrangement, each of the tables 510-512 will in general have a different number of rows. Also, it should be understood that such an arrangement of separate tables is logical, and the implemented data structures may be different, for example, with a separate array of endpoint values for each function, not necessarily arranged in tables as shown in FIG. 5. To implement the mapping from r to g, each element r_(j) is used to select a corresponding column in the j^(th) table, and the values in that column are retrieved to form a portion of g. For example, the r₁ ^(th) column 520 is selected for the first table 510, and the values in that column are retrieved as g₁, g₂, . . . . This process is repeated for the r₂ ^(nd) column 521 of table 511, the r₃ ^(rd) column 522 of table 512 and so forth to determine all the component values of g. In an embodiment in which piecewise linear functions are used, two columns may be retrieved, and the values in the columns are linearly interpolated to form the corresponding section of g. It should be understood that the table structure illustrated in FIG. 5 is only one example, and that other analogous data structures may be used within the general approach of using lookup tables rather than extensive use of arithmetic functions to evaluate the functions ϕ_(k, j)( ). It should be recognized that while the input r_(p) is real, the output g_(i) is complex. Therefore, the cells of the table can be considered to hold pairs of values for the real and imaginary parts of the output, respectively.
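One possible software analogue of this column-selection arrangement, for the piecewise constant case, is sketched below. The nested-list layout (`tables[j][row][column]`), the function name `lookup_stage`, and the assumption of equal-width sections are illustrative choices only.

```python
def lookup_stage(tables, r):
    """Map the real vector r to the complex vector g.

    tables[j] is the table for r_j: one row per function phi_{k,j} that
    uses r_j, one column per section of the [0, 1] range.
    """
    g = []
    for table, r_j in zip(tables, r):
        ncols = len(table[0])
        col = min(int(r_j * ncols), ncols - 1)   # column selected by r_j
        g.extend(row[col] for row in table)      # one g value per row
    return g


# Hypothetical tables: two functions of r_1 and one function of r_2,
# each with two sections.
tables = [
    [[1 + 0j, 2 + 0j], [3 + 0j, 4 + 0j]],
    [[5j, 6j]],
]
g = lookup_stage(tables, [0.6, 0.2])
```

A single column index per r_(j) serves every function of that r_(j), so the quantization of each real signal is performed once regardless of how many terms use it.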

The lookup table approach can be applied to piecewise linear functions, as illustrated in FIG. 6A for one representative transformation g_(k)=ϕ(r_(p)). The value r_(p) is first processed in a quantizer 630, which determines which segment r_(p) falls on, and outputs m_(p) representing that segment. The quantizer also outputs a “fractional” part f_(p), which represents the location of r_(p) in the interval for that segment. Each cell in the column 621 identified by m_(p) has two quantities, which essentially define one endpoint and the slope of the segment. The slope is multiplied in a multiplier 632 by the fractional part f_(p), and the product is added to the endpoint value in an adder 634 to yield the value g_(k). Of course this is only one implementation, and different arrangements of the values stored in the table 611, or in multiple tables, and the arrangement of the arithmetic operators on selected values from the table to yield the value g may be used. FIG. 6B shows another arrangement for use with piecewise linear functions. In this arrangement, the output m_(p) selects two adjacent columns of the table, which represent the two endpoint values. Such an arrangement reduces the storage by a factor of two as compared to the arrangement of FIG. 6A. However, because the slopes of the linear segments are not stored, an adder 635 is used to take the difference between the endpoint values, and then this difference is multiplied by f_(p) and added to one of the endpoint values in the manner of FIG. 6A.
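The two storage arrangements can be compared with a sketch such as the following, which assumes that r_(p) is held as an unsigned fixed-point integer so that the quantizer reduces to splitting bits. The function names, the bit widths, and the convention that the stored "slope" is the per-segment difference between endpoints are assumptions for illustration.

```python
def quantize(r_fixed, total_bits, s):
    """Split a fixed-point value into segment index m and fraction f.

    The top s bits select one of 2**s segments; the remaining bits give
    the position within that segment as a fraction in [0, 1).
    """
    frac_bits = total_bits - s
    m = r_fixed >> frac_bits
    f = (r_fixed & ((1 << frac_bits) - 1)) / (1 << frac_bits)
    return m, f


def eval_slope_table(cells, m, f):
    """FIG. 6A style: each cell stores (start value, per-segment slope)."""
    start, slope = cells[m]
    return start + slope * f


def eval_endpoint_table(endpoints, m, f):
    """FIG. 6B style: only endpoints are stored; the slope is a difference."""
    return endpoints[m] + (endpoints[m + 1] - endpoints[m]) * f


# Hypothetical 4-segment function (s = 2) with an 8-bit argument.
endpoints = [0.0, 2.0, 3.0, 7.0, 8.0]
cells = [(endpoints[i], endpoints[i + 1] - endpoints[i]) for i in range(4)]
m, f = quantize(0b01100000, total_bits=8, s=2)
via_slopes = eval_slope_table(cells, m, f)
via_endpoints = eval_endpoint_table(endpoints, m, f)
```

Both arrangements return the same value; the first spends storage to save a subtraction, and the second does the reverse.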

In the description above, the input u[.] is processed as a whole, without necessarily considering any multiple band structure in the signal in computation of a distortion term δ[.] from which a predistorted output v[.]=u[.]+δ[.] is computed. In the following description, we assume that there are N_(b) spectrally distinct bands, which together occupy only a part of the available bandwidth generally, and that the input can be decomposed as a sum of spectrally distinct signals as

u[.]=u ₁[.]+u ₂[.]+ . . . +u _(N) _(b) [.].

The techniques described above may be used in combination with the further techniques described below targeting the multi-band nature of the input. That is, the multi-band techniques essentially extend the single-band techniques for application to multi-band input.

In this embodiment, the sampling rate of the input signal is maintained in each of the band signals, such that individually each of these band signals is oversampled because each of the distinct bands occupies only a fraction of the original bandwidth. However, as described below, the approach makes use of complex combinations of these band signals, and after such combinations a higher sampling rate is needed to represent the combinations as compared to the individual band signals. Therefore, although in alternative embodiments it is possible to down sample the band signals, and potentially represent their complex combinations at sampling rates below the sampling rate of the overall signal, the computational overhead and complexity of the down and up sampling does not warrant any reduction in underlying computation.

In one approach to processing, the multiple band input uses essentially the same structure as shown in FIG. 2, which is used in the single-band case. In particular, the complex transformation component 210, labelled L_(C) and referred to as the “complex layer,” receives the complex input baseband signal, u[.], and decomposes it, for example, by bandpass filtering, into a set of band signals (u₁[.], u₂[.], . . . , u_(N) _(b) [.]) and then outputs a set of complex baseband signals, w[.], where each of these baseband signals is determined from a subset of one or more of the band signals, u_(i)[.], with the output baseband signals again being represented as a vector of signals and indexed w₁[.], w₂[.], . . . , w_(N) _(W) [.], where N_(W) is the number of such signals.

In the multiple band case, the output signals may be computed in a number of ways, including by applying one or more of the following constructions, without limitation:

-   a. w_(k)=u_(a)r₀^(−α) for some a∈{1, . . . , N_(b)} and α∈(0,1), where u_(a) is the a^(th) band, and r₀=|u₁|²+ . . . +|u_(N) _(b) |²;
-   b. w_(k)=w*_(a) (i.e., complex conjugate) for some k>N_(b)+1, where the parameter a∈{1, . . . , k−1} may depend on k;
-   c. w_(k)=w_(a)(D_(α)w_(b)) for some k>N_(b)+1, where the integer parameters a, b∈{1, . . . , k−1} and α may depend on k;
-   d. w_(k)=|w_(a)|^(α)e^(jβ∠w_(a)) for some k>N_(b)+1, where the integer parameters a∈{1, . . . , k−1} and β, and the real parameter α>0 may depend on k. This construction may be referred to as an (α, β)-rotation function, which for α=β reduces to a power (i.e., exponent) function.

Note that construction (a) depends on a single band signal u_(a) (possibly scaled by an overall power). The construction (c) may introduce “cross-terms”, and repeated application of that construction, along with intervening other of the constructions, can be used to generate a wide variety of cross-terms, which may be associated with particular distortion components. Furthermore, other constructions in addition to or instead of those shown above may be used, including constructions described above for the single-band case. For example, within-band constructions that are analogous to those used in the single-band case can be used, such that w_(k)=0.5(D_(α)w_(a)+j^(d)w_(b)), with the added constraint that both w_(a) and w_(b) depend on only a single band signal u_(i) (as is implicitly the case in the single-band case).
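The four multi-band constructions may be sketched in software as follows. The function names, the list-based signal representation, the zero-filling of delayed samples, and the handling of zero total power in construction (a) are assumptions introduced for the example.

```python
import cmath


def power_scaled(bands, a, alpha):
    """Construction (a): w = u_a * r0**(-alpha), r0 = sum of |u_i|^2."""
    out = []
    for t in range(len(bands[0])):
        r0 = sum(abs(b[t]) ** 2 for b in bands)
        out.append(bands[a][t] * r0 ** (-alpha) if r0 > 0 else 0j)
    return out


def conj(wa):
    """Construction (b): complex conjugate."""
    return [x.conjugate() for x in wa]


def delayed_product(wa, wb, alpha):
    """Construction (c): w[t] = w_a[t] * w_b[t - alpha], zero-filled."""
    n = min(alpha, len(wb))
    wb_delayed = [0j] * n + list(wb[:len(wb) - n])
    return [x * y for x, y in zip(wa, wb_delayed)]


def rotation(wa, alpha, beta):
    """Construction (d): w = |w_a|**alpha * exp(j * beta * angle(w_a))."""
    return [abs(x) ** alpha * cmath.exp(1j * beta * cmath.phase(x)) for x in wa]


# Hypothetical two-band usage forming the cross-term u1[t] u1[t-1] u2*[t].
u1 = [1 + 0j, 2 + 0j]
u2 = [1j, 1j]
cross = delayed_product(delayed_product(u1, u1, 1), conj(u2), 0)
squared = rotation([1 + 1j], 2, 2)      # alpha = beta: ordinary power
scaled = power_scaled([[3 + 0j], [4j]], a=0, alpha=0.5)
```

The last two lines illustrate that an (α, β)-rotation with α=β=2 behaves as squaring, and that construction (a) normalizes a band sample by a power of the total instantaneous power.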

Therefore, one can consider the resulting set of complex signals w_(k) as including, for each of the band signals u_(a), a subset of the w_(k) that depends only on that band signal, which can include that band signal unmodified, as well as processed versions of the signal including products of delayed versions, complex conjugates, powers, etc. of other signals in the subset, as well as power-scaled versions based on overall power of the input signal. The resulting set of complex signals w_(k) then further includes a “cross-product” subset, which includes complex combinations of two or more band signals, for example, resulting from application of construction (c).

It should be recognized that for each of the separate bands, the multi-band approach described above retains the power of linearization within the band, for example, based on the subset of complex signals that depend only on the input in that band using the structure described above for the single-band case. More generally, the approaches and constructions described above for the single-band case may be combined with the approaches described here for the multi-band case. The multi-band approach further adds the capability of addressing cross terms involving two or more bands, and effects of overall power over multiple or all of the bands. An intention of operations in the complex layer is to generate complex signals which correspond to harmonics or other expected distortion components that arise from the individual bands contained in the baseband input signal u.

One way to accomplish this goal of the resulting signals having harmonics in the baseband is to only use what are referred to herein as “degree 1” harmonics. A degree-1 term is defined as a signal that falls at a frequency position within the baseband that is insensitive to the carrier frequency f_(c) to which the baseband signal u is ultimately modulated for radio-frequency transmission. Note that, for example, construction (c) for computing the w signals of the form w_(k)=w_(a)(D_(α)w_(b)), in combination with construction (b) w_(k)=w*_(a), can be used to yield derived signals of a form

u ₁[t]u ₁[t−1]u* ₂[t].

More specifically, the degree of a signal w_(k), which is constructed as a combination of a set of signals (e.g., from the band signals u_(i)), is defined according to rules corresponding to the construction rules presented above: each complex signal introduced according to (a) is assigned degree 1; if w_(k) is defined via w_(a) according to construction (b), the degree of w_(k) is minus the degree of w_(a); if w_(k) is defined via w_(a) and w_(b) according to construction (c), the degree of w_(k) is the sum of degrees of w_(a) and w_(b); and if w_(k) is defined via w_(a) according to construction (d), the degree of w_(k) is the degree of w_(a) times β.
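These degree rules lend themselves to a short recursive sketch. The tuple encoding of constructions (`"band"`, `"conj"`, `"prod"`, `"rot"`) and the function name `degree` are hypothetical conventions adopted only for this example; delays are not encoded because they do not affect the degree.

```python
def degree(expr):
    """Return the degree of a signal described by a nested tuple.

    ("band", a)          -> construction (a), degree 1
    ("conj", e)          -> construction (b), minus the degree of e
    ("prod", e1, e2)     -> construction (c), sum of the degrees
    ("rot", e, beta)     -> construction (d), degree of e times beta
    """
    kind = expr[0]
    if kind == "band":
        return 1
    if kind == "conj":
        return -degree(expr[1])
    if kind == "prod":
        return degree(expr[1]) + degree(expr[2])
    if kind == "rot":
        return degree(expr[1]) * expr[2]
    raise ValueError("unknown construction: %r" % (kind,))


# The example signal u1[t] u1[t-1] u2*[t] has degree 1 + 1 - 1 = 1.
u1, u2 = ("band", 1), ("band", 2)
example = ("prod", ("prod", u1, u1), ("conj", u2))
example_degree = degree(example)
```

A design tool could use such a check to discard candidate signals whose degree differs from 1 before they are considered for selection.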

As in the single-band case, the generated complex signals are passed to the second stage, the real transformation component 220, labelled L_(R) and also referred to as the “real layer.” The real transformation component receives the N_(W) signals w, as well as the real “envelope” signal(s) e, and outputs N_(R) (generally greater than N_(W)) real signals r, in a bounded range, in this implementation in a range [0,1]. In one implementation for the multiple band case, the particular signals are chosen from one or more of the following families of signals resulting from sequential (i.e., k=1, 2, . . . ) application of constructions selected from the following, without limitation:

-   a. r_(k)=e_(k) for k=1, . . . , N_(E), where e₁, . . . , e_(N_(E)) are the components of signal e;
-   b. r_(k)=Re(w_(a)w*_(b)) or r_(k)=Im(w_(a)w*_(b)), where w_(a) and w_(b) are formed by construction (a), w_(k)=u_(a)r₀^(−α) above, or are delayed versions, w_(k)=D_(α)u_(a)r₀^(−α) for α≥0, of such constructions;
-   c. r_(k)=D_(α)r_(a)+θD_(β)r_(b), where θ∈{1, −1}, a, b∈{1, . . . , k−1}, and α, β∈ℤ may depend on k;
-   d. r_(k)=(D_(α)r_(a))(D_(α)r_(b)) for all t, where a, b∈{1, . . . , k−1} and α∈ℤ may depend on k;
-   e. r_(k)[t]=r_(k)[t−1]+2^(−d)(r_(a)[t]−r_(k)[t−1]) for all t∈ℤ, where a∈{1, . . . , k−1} and d∈ℤ, d>0, may depend on k (equivalently, r_(k) is the response of a first order linear time invariant (LTI) filter with a pole at 1−2^(−d), applied to r_(a) for some a<k);
-   f. r_(k) is the response (appropriately scaled and centered) of a second order LTI filter with complex poles (carefully selected for easy implementability).

According to construction (a), the components of e are automatically treated as real signals (i.e., the components of r). Construction (b) presents a convenient way of converting complex signals to real ones while assuring that scaling the input u by a complex constant with unit absolute value does not change the outcome (i.e., phase-invariance). Constructions (c) and (d) allow addition, subtraction, and (if needed) multiplication of real signals. Construction (e) allows averaging of real signals, and construction (f) offers more advanced spectral shaping, which is needed for some PAs which show a second order resonance behavior.
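As a hedged sketch of two of these constructions, the following code computes a real signal by construction (b) and smooths a real signal by construction (e). The function names `re_product` and `leaky_average`, the zero initial conditions, and the sample values are assumptions made for illustration only.

```python
import cmath

def re_product(wa, wb):
    # Construction (b): r[t] = Re(wa[t] * conj(wb[t])).
    return [(a * b.conjugate()).real for a, b in zip(wa, wb)]

def leaky_average(ra, d):
    # Construction (e): r_k[t] = r_k[t-1] + 2**-d * (r_a[t] - r_k[t-1]),
    # i.e., a first-order filter with a pole at 1 - 2**-d.
    # A zero initial state is assumed here.
    out, state = [], 0.0
    for x in ra:
        state += (x - state) * 2.0 ** -d
        out.append(state)
    return out

# Hypothetical input samples and a one-sample delayed copy (D_1 u).
u = [1 + 1j, 0.5j, -2 + 0.25j, 0.75 - 1j]
delayed = [0j] + u[:-1]

# Rotating the input by a unit-magnitude constant leaves r unchanged.
rotated = [x * cmath.exp(0.7j) for x in u]
rotated_delayed = [0j] + rotated[:-1]
r = re_product(u, delayed)
r_rot = re_product(rotated, rotated_delayed)

smoothed = leaky_average([1.0, 1.0, 1.0], d=1)
```

The comparison of `r` and `r_rot` illustrates the phase-invariance property noted above: a common phase rotation of the input cancels in the product of a signal with a conjugated signal.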

As in the single-band case, the overall distortion term is computed as a sum of N_(k) terms

${\delta \lbrack t\rbrack} = {\sum\limits_{k = 1}^{N_{k}}{{w_{a_{k}}\left\lbrack {t - d_{k}} \right\rbrack}{\sum\limits_{j}{\varphi_{k,j}\left( {r_{j}\lbrack t\rbrack} \right)}}}}$

where the k^(th) term has a selected one of the complex signals indexed by a_(k) and a selected delay d_(k), and scales the complex signal w_(a) _(k) [t−d_(k)] by a sum of the estimated functions of single ones of the real signals r_(j)[.]. Again, as in the single-band case, the summation over j may omit certain terms (i.e., only relying on a subset of the r_(j)), for example, as chosen by the designer according to their know-how and other experience or experimental measurements. This transformation is implemented by the combination stage 230 in the manner described for the single-band case.
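The summation above can be sketched as follows, under stated assumptions: each function ϕ is stored as a small table of complex values interpolated linearly over [0,1], and the names `pwl` and `distortion`, the dictionary layout, and the sample data are hypothetical.

```python
def pwl(table, r):
    """Piecewise-linear interpolation of a complex table over r in [0, 1]."""
    n = len(table) - 1
    x = min(max(r, 0.0), 1.0) * n
    i = min(int(x), n - 1)
    frac = x - i
    return table[i] + frac * (table[i + 1] - table[i])

def distortion(w, r, terms, t):
    """delta[t] = sum_k w[a_k][t - d_k] * sum_j phi_{k,j}(r_j[t]).

    terms is a list of (a_k, d_k, {j: table}) tuples; only the listed
    real signals j contribute to the gain of term k.
    The caller must ensure t - d_k is a valid non-negative index.
    """
    total = 0j
    for a_k, d_k, tables in terms:
        gain = sum(pwl(tab, r[j][t]) for j, tab in tables.items())
        total += w[a_k][t - d_k] * gain
    return total

# Hypothetical data: one complex signal, one real signal, one term with
# delay 1 whose gain function is phi(r) = r (a two-entry table).
w = {0: [1 + 0j, 2j, 1 - 1j]}
r = {0: [0.0, 0.5, 1.0]}
terms = [(0, 1, {0: [0j, 1 + 0j]})]
delta_2 = distortion(w, r, terms, 2)   # w[0][1] * phi(1.0)
delta_1 = distortion(w, r, terms, 1)   # w[0][0] * phi(0.5)
```

Omitting a real signal from a term's table dictionary corresponds to omitting that term from the summation over j.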

As introduced above, the particular constructions used to assemble the complex signals w_(k) and real signals r_(k) through selections of the sequences of constructions may be based on trial-and-error, analytical prediction of impact of various terms, heuristics, and/or a search or combinatorial optimization to select the subset for a particular situation (e.g., for a particular power amplifier, transmission band, etc.). One possible optimization approach may make use of greedy selection of productions to add to a set of w_(k) and r_(k) signals according to their impact on an overall distortion measure. In such a selection of the terms w_(k) to use in the summation of the distortion term, these terms may be restricted to degree-1 terms.

A number of aspects of the constructions for the complex signals w_(k) are noteworthy. For example, certain cross-terms between bands (e.g., intermodulation terms) do not scale with the power of the individual band terms. Therefore, a possible scaling of a band signal following construction (a) is found to be effective, for example for α=4:

$w_{k} = \frac{u_{i}}{\sqrt[4]{|u_{1}|^{2} + \ldots + |u_{N_{b}}|^{2}}}.$

Note that in most single-band applications, defining real signals by an “absolute value” formula r_(i)[t]=|u_(q)[t]| may provide better results than a “power” formula r_(i)[t]=|u_(q)[t]|², which may be explained, and justified by experimental observation of the scaling properties of the non-linear harmonics induced by typical power amplifiers (PAs): one can view r_(i)[t]=|u_(q)[t]| as the re-scaled power r_(i)[t]=|u_(q)[t]|² /|u_(q)[t]|. However, this does not work the same way in the multi-band case: defining r₁[t]=|u₁[t]| does not yield the best re-scaling, as compared to the denominator depending on the total signal power, as in r₁[t]=|u₁[t]|²/|u[t]|, where u[t] is the total baseband input (i.e., the sum of all bands). To facilitate proper scaling of real signals, while avoiding aliased harmonics, the original band signals u₁; . . . ; u_(N) _(b) can pass through the re-scaling transformation of construction (a), for example with α=4. Once the rescaling has taken place, it may be more efficient to define real signals according to construction (b), for example as

r _(k)[t]=Re{u _(q)[t]*u _(q)[t−τ]}, or r _(k)[t]=Im{u _(q)[t]*u _(q)[t−τ]}.
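For illustration only, the rescaling with α=4 may be sketched as below. The function name `rescale_bands` and the small floor `eps` that guards against division by zero are assumptions and are not taken from the description above.

```python
def rescale_bands(bands, eps=1e-12):
    """Divide every band sample by the fourth root of the total band power.

    bands is a list of equal-length lists of complex samples, one per band.
    eps is an assumed floor that avoids dividing by zero for silent input.
    """
    n = len(bands[0])
    out = [[0j] * n for _ in bands]
    for t in range(n):
        power = sum(abs(b[t]) ** 2 for b in bands)
        scale = max(power, eps) ** -0.25
        for i, b in enumerate(bands):
            out[i][t] = b[t] * scale
    return out

# Two hypothetical bands, one sample each: |u1| = 3 and |u2| = 4.
w = rescale_bands([[3 + 0j], [4j]])
# |w1|^2 equals |u1|^2 divided by the total amplitude sqrt(9 + 16) = 5.
w1_power = abs(w[0][0]) ** 2
```

With this scaling, the squared magnitude of a rescaled band signal behaves like the band power divided by the total amplitude, which is the multi-band analogue of the rescaled power discussed above.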

Another noteworthy construction of a complex signal uses the (α, β) rotation function of construction (d). In general, in multi-band systems for which the carrier-frequency-to-baseband-spectral-diameter ratio is small enough (say, less than 5), significant high order even inter-band harmonics may be created by a power amplifier. Compensating for those harmonics may require performing higher-order power operations (such as u₁[t]→u₁[t]⁵) on individual band signals. In general, taking a complex number z to positive integer power k means multiplying its phase by k, and taking its absolute value to the k^(th) power. In predistortion applications, the phase manipulation part of the power operation may be significant to the overall performance, while taking the absolute value to the power k may be counterproductive, for example, because it does not match with the harmonic scaling properties of common power amplifiers and also introduces significant numerical difficulties in fixed point implementations. Taking these considerations into account, use of the (α, β) rotation functions has been found effective in practice, for example, in cancelling even harmonics.

As introduced above, restriction to degree-1 complex signals makes the predistorter insensitive to the ultimate carrier frequency, f_(c). More generally, it is not necessary to restrict the w_(k) terms that are used to degree-1 terms. For example, for degree 0 and degree 2 terms, the frequency location of the term within the baseband is not independent of the carrier frequency. To account for this, the complex layer receives an additional complex signal defined as

${e_{c}\lbrack n\rbrack} = {\exp \left( {{j2\pi \frac{f_{c}}{f_{s}}n} + \varphi} \right)}$

for some preferably constant phase ϕ, where f_(c) is the carrier frequency for RF transmission and f_(s) is the baseband sampling frequency for the input signal u[t]. Degree 2 terms w_(k) are multiplied by e_(c) when used in the summation to determine the distortion term, and degree 0 terms are multiplied by e*_(c).

Note that the definition of e_(c) depends on the ratio f_(c)/f_(s) as well as the initial phase ϕ. Preferably, this signal is generated such that ϕ is equal at the start (n=0) of each transmission frame so that the parameter estimation is consistent with each parameter use. Furthermore, if the frequency ratio is irreducible, for example, f_(c)/f_(s)=7/4, then the signal e_(c) repeats every 4 samples (i.e., e_(c)[0]=e_(c)[4]).
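The periodicity can be checked numerically with the following sketch. It assumes that ϕ acts as a phase offset inside the complex exponential, and the helper name `carrier_samples` is hypothetical. Exact rational arithmetic is used for the number of carrier cycles so that the period equals the denominator of the reduced ratio.

```python
import cmath
from fractions import Fraction

def carrier_samples(ratio, count, phase=0.0):
    """Samples e_c[n] = exp(j*(2*pi*ratio*n + phase)) for n = 0..count-1.

    ratio is f_c/f_s as a Fraction; phase is assumed to be a phase offset.
    """
    out = []
    for n in range(count):
        cycles = (ratio * n) % 1   # exact fractional number of carrier cycles
        out.append(cmath.exp(1j * (2 * cmath.pi * float(cycles) + phase)))
    return out

ratio = Fraction(7, 4)             # f_c / f_s in lowest terms
period = ratio.denominator         # number of samples after which e_c repeats
e_c = carrier_samples(ratio, 2 * period)
```

Because the sequence repeats, an implementation could store one period of e_(c) in a short table rather than evaluating the exponential for every sample.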

Referring to FIG. 7A, an example of predistortion in a two-band situation is illustrated with narrowband signals that are ultimately transmitted (i.e., as the radio frequency signal p(t)) at frequencies f₁+f_(c) (711) and f₂+f_(c) (712), where f_(c) (701) is the RF carrier frequency. In this example, f₁ is illustrated as negative, and f₂ is illustrated as positive. For example, f_(c)=860.16 MHz, and |f₂−f₁|=190.0 MHz. This example focusses on predistortion to address intermodulation terms such as an 8^(th) order intermodulation term at f₁−Δf=−4f₁+4f₂ (721) and a 10^(th) order term at 2f_(c)+6f₁−4f₂ (722). Other distortion terms (723, 724) are illustrated near f₂. These terms are at frequencies −5f₁+5f₂ and 2f_(c)+5f₁−3f₂, respectively. One way to select these terms is by identifying spectral energy at these frequencies, and determining the corresponding terms that might be responsible for distortion effects at those frequencies.

In this example, the input signal u[t] is represented at a complex sampling rate f_(s)=491.52 MHz (i.e., f_(c)/f_(s)=7/4), for modulation to the range f_(c)−f_(s)/2 to f_(c)+f_(s)/2. Referring to FIG. 7B, the input signal therefore has components u₁ (731) and u₂ (732) at frequencies f₁ and f₂, respectively. Referring to FIG. 7C, the distortion term δ computed as described above, therefore includes terms at frequencies −f_(c)−4f₁+4f₂ (841) and f_(c)+6f₁−4f₂ (842), for the 8^(th) order and 10^(th) order terms respectively.

In this example, to address the 8^(th) order term (841), a complex signal w_(k)=(u*₁u₂)⁴ is used. Such a term corresponds, for example, to application of constructions (a)-(c) above. Without compensation for the carrier frequency, because this is a degree zero term, it would be modulated to frequency f_(c)−4f₁+4f₂, rather than to frequency −4f₁+4f₂. Therefore, as discussed above, it is multiplied by e*_(c), yielding a distortion term w_(k)=e*_(c)(u*₁u₂)⁴, which is scaled by the adapted gain, Σ_(i∈Λ) _(k) g_(i)[t]. Similarly, the 10^(th) order term (842) may be addressed using a complex signal w_(k)=u₁ ²(u₁u*₂)⁴, which is a degree 2 term and therefore would be multiplied by e_(c) to yield a term w_(k)=e_(c)u₁ ²(u₁u*₂)⁴ to be scaled by an adapted gain.
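A small numerical check of the 8^(th) order term is sketched below. The band center frequencies F1 and F2 are assumed values chosen only so that |f₂−f₁|=190 MHz, the band signals are modeled as pure tones, and the helper names are hypothetical.

```python
import cmath

FS = 491.52            # complex sampling rate in MHz (from the example)
FC = 860.16            # carrier frequency in MHz (from the example)
F1, F2 = -95.0, 95.0   # assumed band centers with |F2 - F1| = 190 MHz

def tone(freq_mhz, n):
    # Unit-amplitude complex tone at freq_mhz, sampled at FS.
    return cmath.exp(2j * cmath.pi * freq_mhz / FS * n)

def eighth_order_term(n):
    # w_k[n] = conj(e_c[n]) * (conj(u1[n]) * u2[n])**4 for tone inputs.
    u1, u2, e_c = tone(F1, n), tone(F2, n), tone(FC, n)
    return e_c.conjugate() * (u1.conjugate() * u2) ** 4

# Frequency predicted for the compensated degree-0 term, in MHz.
f_term = -FC - 4 * F1 + 4 * F2
max_err = max(abs(eighth_order_term(n) - tone(f_term, n)) for n in range(64))
```

With the assumed band centers, `f_term` evaluates to approximately −100.16 MHz, a few MHz below the lower band, and `max_err` stays at rounding-error level, consistent with the term being a tone at −f_(c)−4f₁+4f₂ in the sampled baseband.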

In scaling the 8^(th) order term w_(k)=e*_(c)(u*₁u₂)⁴, the following real functions may be used, without limitation:

r₁=|u₁|/√(|u₁|²+|u₂|²);

r₂=|u₂|/√(|u₁|²+|u₂|²);

r ₃ =r ₁ +r ₂;

r ₄ =r ₁ −r ₂;

r ₅ =|u ₁|;

r ₆ =|u ₂|; and

r₇=r₅r₆.

Therefore, adapted functions ϕ_(k, j)(r_(j)) for these real functions are used to compute the respective gain terms g_(i).
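The seven real functions listed above can be evaluated per sample as in the following sketch. The function name `real_features` and the handling of the case in which both bands are zero are assumptions.

```python
import math

def real_features(u1, u2):
    """Return [r1, ..., r7] for one sample of the two band signals."""
    a1, a2 = abs(u1), abs(u2)
    # Assumed guard: use 1.0 as the divisor when both bands are silent.
    norm = math.hypot(a1, a2) or 1.0
    r1, r2 = a1 / norm, a2 / norm
    return [r1, r2, r1 + r2, r1 - r2, a1, a2, a1 * a2]

# Hypothetical sample with |u1| = 3 and |u2| = 4.
features = real_features(3 + 0j, 4j)
```

Note that r₁ and r₂ are bounded by 1 by construction, whereas r₅ through r₇ scale with the input level and may need normalization to fall in the bounded range expected by the lookup tables.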

Referring to FIG. 8, the sampling and periodicity of e_(c) is illustrated for the f_(c)/f_(s)=7/4 situation shown in FIGS. 7A-C. The samples of the carrier taken at the sampling frequency are shown as open circles, illustrating the periodicity of 4 samples.

Therefore, as described above, in both the single and multi-band cases, a configuration of a predistorter involves selection of the sequences of constructions used to form the complex signals w_(k) and real signals r_(j), which are computed at runtime of the predistorter, and remain fixed for the configuration. The parameters of the nonlinear functions ϕ_(k, j)(r), each of which maps from a scalar real signal value r to a complex value, are in general adapted during operation of the system. As described further below, these functions are constructed using piecewise linear forms, where in general, individual parameters only or primarily impact a limited range of input values, in the implementation described below, by scaling kernel functions that are non-zero over limited ranges of input values. A result of this parameterization is a significant degree of robustness resulting from well-conditioned optimizations used to determine and adapt the individual parameters for each of the nonlinear functions.

Very generally, the parameters x of the predistorter 130 (see FIG. 1), which implements the compensation function C, may be selected to minimize a distortion between a desired output (i.e., the input to the compensator) u[.], and the sensed output of the power amplifier y[.]. For example, the parameters x, which may be the values defining the piecewise constant or piecewise linear functions ϕ, are updated, for example, in a gradient-based iteration based on a reference pair of signals (u[.], y[.]), for example, adjusting the values of the parameters such that u[.]=y[.]. In some examples that make use of tables, for example with 2^(S) entries, to encode the non-linear functions ϕ_(k)( ), each entry may be estimated in the gradient procedure. In other examples, a smoothness or other regularity is enforced for these functions by limiting the number of degrees of freedom to less than 2^(S), for example, by estimating the non-linear function as being in the span (linear combination) of a set of smooth basis functions. After estimating the combination of such functions, the table is then generated.

Therefore, the adaptation section 160 essentially determines the parameters used to compute the distortion term as δ[t]=Δ(u[t−τ], . . . , u[t−1]) in the case that τ delayed values of the input u are used. More generally, τ_(d) delayed values of the input and τ_(f) look-ahead values of the input are used. This range of inputs is defined for notational convenience as q_(u)[t]=(u[t−τ_(d)], . . . , u[t+τ_(f)]). (Note that with the optional use of the terms e[t], these values are also included in the q_(u)[t] term.) This term is parameterized by values of a set of complex parameters x, therefore the function of the predistorter can be expressed as

v[t]=C(q _(u)[t])=u[t]+Δ(q _(u)[t])

One or more approaches to determining the values of the parameters x that define the function Δ( ) are discussed below.

The distortion term can be viewed in a form as being a summation

${\delta \lbrack t\rbrack} = {\sum\limits_{b}{\alpha_{b}{B_{b}\left( {q_{u}\lbrack t\rbrack} \right)}}}$

where the α_(b) are complex scalars, and B_(b)( ) can be considered to be basis functions evaluated with the argument q_(u)[t]. The quality of the distortion term generally relies on there being sufficient diversity in the basis functions to capture the non-linear effects that may be observed. However, unlike some conventional approaches in which the basis functions are fixed, and the terms α_(b) are estimated directly, or possibly are represented as functions of relatively simple arguments such as |u[t]|, in approaches described below, the equivalents of the basis functions B_(b)( ) are themselves parameterized and estimated based on training data. Furthermore, the structure of this parameterization provides both a great deal of diversity that permits capturing a wide variety of non-linear effects, and efficient runtime and estimation approaches using the structure.

As discussed above, the complex input u[t] is processed to produce a set of complex signals w_(k)[t] using operations such as complex conjugation and multiplication of delayed versions of u[t] or other w_(k)[t]. These complex signals are then processed to form a set of phase-invariant real signals r_(p)[t] using operations such as magnitude, real, or imaginary parts, of various w_(k)[t] or arithmetic combinations of other r_(p)[t] signals. In some examples, these real values are in the range [0,1.0] or [−1.0,1.0], or in some other predetermined bounded range. The result is that the real signals have a great deal of diversity and depend on a history of u[t], at least by virtue of at least some of the w_(k)[t] depending on multiple delays of u[t]. Note that computation of the w_(k)[t] and r_(p)[t] can be performed efficiently. Furthermore, various procedures may be used to retain only the most important of these terms for any particular use case, thereby further increasing efficiency.

Before turning to a variety of parameter estimation approaches, recall that the distortion term can be represented as

${\delta \lbrack t\rbrack} = {\sum\limits_{k}{{w_{a_{k}}\left\lbrack {t - d_{k}} \right\rbrack}{\Phi_{k}\left( {r\lbrack t\rbrack} \right)}}}$

where r[t] represents the entire set of the r_(p)[t] real quantities (e.g., a real vector), and Φ( ) is a parameterized complex function. For efficiency of computation, this non-linear function is separated into terms that each depend on a single real value as

${\Phi_{k}\left( {r\lbrack t\rbrack} \right)} = {\sum\limits_{p}{{\varphi_{k,p}\left( {r_{p}\lbrack t\rbrack} \right)}.}}$

For parameter estimation purposes, each of the scalar complex non-linear functions ϕ( ) may be considered to be made up of a weighted sum of the fixed real kernels b_(l)(r), discussed above with reference to FIGS. 4A-D, such that

${\varphi_{k,p}\left( r_{p} \right)} = {\sum\limits_{l}{x_{k,p,l}{b_{l}\left( r_{p} \right)}}}$

Introducing the kernel form of non-linear functions into the definition of the distortion term yields

${\delta \lbrack t\rbrack} = {\sum\limits_{k,p,l}{x_{k,p,l}{w_{a_{k}}\left\lbrack {t - d_{k}} \right\rbrack}{{b_{l}\left( {r_{p}\lbrack t\rbrack} \right)}.}}}$

In this form representing the triple (k, p, l) as b, the distortion term can be expressed as

${{\delta \lbrack t\rbrack} = {\sum\limits_{b}{x_{b}{B_{b}\lbrack t\rbrack}}}},$

where

B_(b)[t]=B_(b)(q_(u)[t])=w_(a_(k))[t−d_(k)]b_(l)(r_(p)[t]).

It should be recognized that for each time t, the complex values B_(b)[t] depend on the fixed parameters z and the input u over a range of times, but do not depend on the adaptation parameters x. Therefore the complex values B_(b)[t] for all the combinations b=(k, p, l) can be used in place of the input in the adaptation procedure.
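The flattening of the triple (k, p, l) into a single index b can be sketched as follows. The kernels of FIGS. 4A-D are not reproduced here; triangular "hat" kernels on a uniform grid over [0,1] are assumed instead, and the names `hat_kernel`, `basis_row`, and `distortion_from_basis` are hypothetical.

```python
def hat_kernel(l, num_kernels, r):
    """Assumed triangular kernel centered at l/(num_kernels-1) on [0, 1]."""
    width = 1.0 / (num_kernels - 1)
    return max(0.0, 1.0 - abs(r - l * width) / width)

def basis_row(w, r, terms, num_kernels, t):
    """Return the values B_b[t] for every b = (k, p, l).

    terms[k] = (a_k, d_k, [p, ...]) lists the real signals used by term k.
    The row depends only on the signals, not on the adapted parameters x.
    """
    row = []
    for a_k, d_k, real_ids in terms:
        for p in real_ids:
            for l in range(num_kernels):
                row.append(w[a_k][t - d_k] * hat_kernel(l, num_kernels, r[p][t]))
    return row

def distortion_from_basis(row, x):
    # delta[t] = sum_b x_b * B_b[t], which is linear in the parameters x.
    return sum(xb * bb for xb, bb in zip(x, row))

# Hypothetical data: one complex signal, one real signal, five kernels.
w = {0: [1 + 1j, 2 + 0j]}
r = {0: [0.0, 0.3]}
terms = [(0, 0, [0])]
row = basis_row(w, r, terms, 5, 1)
delta = distortion_from_basis(row, [1 + 0j] * len(row))
```

Because the distortion term is linear in x once the row is formed, the adaptation procedures that follow reduce to linear least-squares problems over such rows.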

An optional approach extends the form of the distortion term to introduce linear dependence on a set of parameter values, p₁[t], . . . , p_(d)[t], which may, for example be obtained by monitoring temperature, power level, modulation center frequency, etc. In some cases, the envelope signal e[t] may be introduced as a parameter. Generally, the approach is to augment the set of non-linear functions according to a set of environmental parameters p₁[t], . . . , p_(d)[t] so that essentially each function

ϕ_(k,p)(r)

is replaced with d linear multiples to form d +1 functions

ϕ_(k,p)(r), ϕ_(k,p)(r)p ₁[t], . . . , ϕ_(k,p)(r)p _(d)[t].

These and other forms of interpolation of estimated functions according to the set of parameter values may be used, for example, with the functions essentially representing corner conditions that are interpolated by the environmental parameters.

Using the extended set of (d+1) functions essentially forms the set of basis functions

B_(b)(q_(u)[t])=w_(a_(k))[t−d_(k)]b_(l)(r_(p)[t])p_(d)[t]

where b represents the tuple (k, p, l, d) and p₀=1.

What should be evident is that this form achieves a high degree of diversity in the functions B_(b)( ), without incurring runtime computational cost that may be associated with conventional techniques that have a comparably diverse set of basis functions. Determination of the parameter values x_(b) generally can be implemented in one of two ways: direct and indirect estimation. In direct estimation, the goal is to adjust the parameters x according to the minimization:

$x \leftarrow \arg\min_{x} \left\| C(u) - (v - y + u) \right\| = \arg\min_{x} \sum_{t \in T} \left| \Delta(q_{u}[t]) - (v[t] - y[t]) \right|^{2}$

where the minimization varies the function Δ(q_(u)[t]) while the terms q_(u)[t], v[t], and y[t] are fixed and known. In indirect estimation, the goal is to determine the parameters x according to the minimization

$x \leftarrow \arg\min_{x} \left\| C(y) - v \right\| = \arg\min_{x} \sum_{t \in T} \left| y[t] + \Delta(q_{y}[t]) - v[t] \right|^{2}$

where q_(y)[t] is defined in the same manner as q_(u)[t], except using y rather than u. Solutions to both the direct and indirect approaches are similar, and the indirect approach is described in detail below.

Adding a regularization term, an objective function for minimization in the indirect adaptation case may be expressed as

${{E(x)} - {\rho {x}^{2}} + {\frac{1}{N}{\sum\limits_{t \in T}{{{e\lbrack t\rbrack} - {\sum\limits_{k}{x_{k}{B_{k}\left( {q_{y}\lbrack t\rbrack} \right)}}}}}^{2}}}},$

where e[t]=v[t]−y[t]. This can be expressed in vector/matrix form as

${E(x)} - {\rho {x}^{2}} + {\frac{1}{N}{\sum\limits_{t \in T}{{{e\lbrack t\rbrack} - {{a\lbrack t\rbrack}x}}}^{2}}}$

where

a[t]=[B ₁(q _(y)[t]), B ₂(q _(y)[t]), . . . , B _(n)(q _(y)[t])].

Using this form, the following matrices can be computed:

$G = \frac{1}{N}\sum_{t \in T} a[t]^{\prime} a[t], \quad L = \frac{1}{N}\sum_{t \in T} a[t]^{\prime} e[t], \quad \text{and} \quad R = \frac{1}{N}\sum_{t \in T} |e[t]|^{2}.$

From these, one approach to updating the parameters x is by a solution

x←(ρI _(n) +G)⁻¹ L

where I_(n) denotes an n×n identity. An alternative to performing the inversion is to use a coordinate descent approach in which at each iteration, a single one of the parameters is updated.
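A minimal sketch of this batch update is given below, assuming that the prime denotes the conjugate transpose and using a small Gaussian elimination routine in place of an explicit inverse. The names `solve` and `ridge_update` and the sample data are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[piv] = M[piv], M[c]
        for i in range(c + 1, n):
            f = M[i][c] / M[c][c]
            for j in range(c, n + 1):
                M[i][j] -= f * M[c][j]
    x = [0j] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ridge_update(rows, errs, rho):
    """x <- (rho*I + G)^-1 L with G = mean(a' a) and L = mean(a' e)."""
    n, N = len(rows[0]), len(rows)
    G = [[sum(a[i].conjugate() * a[j] for a in rows) / N for j in range(n)]
         for i in range(n)]
    L = [sum(a[i].conjugate() * e for a, e in zip(rows, errs)) / N
         for i in range(n)]
    for i in range(n):
        G[i][i] += rho
    return solve(G, L)

# Hypothetical noiseless data generated from known parameters.
x_true = [1 + 2j, -0.5j]
rows = [[1 + 0j, 0j], [0j, 1 + 0j], [1 + 0j, 1j]]
errs = [sum(a * x for a, x in zip(row, x_true)) for row in rows]
x_hat = ridge_update(rows, errs, rho=0.0)
```

With ρ=0 and noiseless data the known parameters are recovered; a positive ρ shrinks the solution toward zero and keeps the linear system well conditioned when G is nearly singular.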

In some examples, the Gramian, G, and related terms above, are accumulated over a sampling interval T, and then the matrix inverse is computed. In some examples, the terms are updated in a continual decaying average using a “memory Gramian” approach. In some such examples, rather than computing the inverse at each step, a coordinate descent procedure is used in which at each iteration, only one of the components of x is updated, thereby avoiding the need to perform a full matrix inverse, which may not be computationally feasible in some applications.

As an alternative to the solution above, a stochastic gradient approach may be used implementing:

x←x−ζ(a[τ]′(a[τ]x−e[τ])+ρx)

where ζ is a step size that is selected adaptively and τ is a time sample selected at random from a buffer of past pairs (q_(y)[t], v[t]). The buffer is maintained, for example, by periodic updating, and random samples from the buffer are used to update the parameter values using the gradient update equation above.
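The update can be sketched as below under simplifying assumptions: the step size ζ is held fixed rather than selected adaptively, the buffer holds precomputed pairs (a[t], e[t]) rather than raw signal pairs, and the names `sgd_step` and `sgd` are hypothetical.

```python
import random

def sgd_step(x, a, e, zeta, rho):
    # x <- x - zeta * (a' (a x - e) + rho x), with a' the conjugate transpose.
    resid = sum(ai * xi for ai, xi in zip(a, x)) - e
    return [xi - zeta * (ai.conjugate() * resid + rho * xi)
            for ai, xi in zip(a, x)]

def sgd(buffer, n, zeta, rho, steps, seed=0):
    """Run stochastic gradient steps on samples drawn uniformly from buffer."""
    rng = random.Random(seed)
    x = [0j] * n
    for _ in range(steps):
        a, e = rng.choice(buffer)
        x = sgd_step(x, a, e, zeta, rho)
    return x

# Hypothetical noiseless buffer generated from known parameters.
x_true = [1 + 2j, -0.5j]
rows = [[1 + 0j, 0j], [0j, 1 + 0j], [1 + 0j, 1j]]
buffer = [(row, sum(a * x for a, x in zip(row, x_true))) for row in rows]
x_sgd = sgd(buffer, n=2, zeta=0.3, rho=0.0, steps=2000)
```

Each step touches only one row of data, so the cost per update is proportional to the number of parameters rather than to its square.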

A modified version of the stochastic gradient approach involves constructing a sequence of random variables {tilde over (x)}₀, {tilde over (x)}₁, . . . (taking values in ℂ^(n), the n-dimensional complex vectors), defined by

{tilde over (x)} _(k+1) ={tilde over (x)} _(k) +αa[τ _(k)]′(e[τ_(k)]−a[τ_(k) ]{tilde over (x)} _(k))−αρ{tilde over (x)} _(k),

where {tilde over (x)}₀=0, and τ₁, τ₂, . . . are independent random variables uniformly distributed over the available time buffer, and ρ>0 is the regularization constant from the definition of E=E(x), and α>0 is a constant such that

α(ρ+|a[t]|²)<2

for every t. The expected value x _(k)=E[{tilde over (x)}_(k)] can be proven to converge to

x _(*)=arg min E(x)

as k→∞. An optional additional averaging operation

{tilde over (y)} _(k+1) ={tilde over (y)} _(k)+ϵ({tilde over (x)} _(k) −{tilde over (y)} _(k))

with ϵ∈(0,1] may be used. The difference between {tilde over (y)}_(k) and x_(*) is guaranteed to be small for large k as long as ϵ>0 is small enough. This approach to minimizing E(x) can be referred to as a “projection” method, since the map

x↦x+|a[t]|⁻² a[t]′(e[t]−a[t]x)

projects x onto the hyperplane defined by

a[t]x=e[t].
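The projection property can be verified numerically with the sketch below, in which the prime is again assumed to denote the conjugate transpose. The function name `project` and the sample values are hypothetical.

```python
def project(x, a, e):
    """Map x to x + |a|^-2 a' (e - a x), which satisfies a x = e."""
    resid = e - sum(ai * xi for ai, xi in zip(a, x))
    norm2 = sum(abs(ai) ** 2 for ai in a)
    return [xi + ai.conjugate() * resid / norm2 for ai, xi in zip(a, x)]

# Hypothetical row a[t], target e[t], and starting point x.
a = [1 + 1j, 2 - 1j, 0.5j]
e = 3 - 2j
x0 = [0j, 1 + 0j, -1j]
x1 = project(x0, a, e)
x2 = project(x1, a, e)          # projecting again leaves x unchanged
on_plane = sum(ai * xi for ai, xi in zip(a, x1))
```

The stochastic gradient step with ρ=0 and α=|a[t]|⁻² coincides with this map; smaller values of α move only part of the way toward the hyperplane.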

In practical implementations of the algorithm, the sequence of the τ_(k) is generated as a pseudo-random sequence of samples, and the calculations of {tilde over (y)}_(k) can be eliminated (which formally corresponds to ϵ=1, i.e., {tilde over (y)}_(k)={tilde over (x)}_(k−1)). As a rule, this requires using a value of α that results in a smaller minimal upper bound for

α(ρ+|a[t]|²)

(for example, α(ρ+|a[t]|²)<1, or α(ρ+|a[t]|²)<0.5). More generally, the values of α and ϵ are sometimes adjusted, depending on the progress made by the stochastic gradient optimization process, where the progress is measured by comparing the average values of |e[τ_(k)]| and |e[τ_(k)]−a[τ_(k)]{tilde over (x)}_(k)|.

Another feature of a practical implementation is a regular update of the set of the optimization problem parameters a[t], e[t], as the data samples a[t], e[t] observed in the past are being replaced by the new observations.

Yet other adaptation procedures that may be used in conjunction with the approaches presented in this document are described in co-pending U.S. application Ser. No. 16/004,594, titled “Linearization System,” filed on Jun. 11, 2018, and published as US2019/0260401A1 on Aug. 22, 2019, which is incorporated herein by reference.

Returning to the selection of the particular terms to be used for a device to be linearized, these selections are represented in the fixed parameters z. They include the selection of the particular w_(k) terms to generate, then the particular r_(p) to generate from the w_(k), and then the particular subset of r_(p) to use to weight each of the w_(k) in the sum yielding the distortion term. A systematic methodology may be used to make these selections. One such methodology is performed when a new device (a “device under test”, DUT) is evaluated for linearization. For this evaluation, recorded data sequences (u[.], y[.]) and/or (v[.], y[.]) are collected. A predistorter structure that includes a large number of terms, possibly an exhaustive set of terms within a constraint on delays, number of w_(k) and r_(p) terms, etc., is constructed. The least mean squared (LMS) criterion discussed above is used to determine the values of the exhaustive set of parameters x. Then, a variable selection procedure is used and this set of parameters is reduced, essentially, by omitting terms that have relatively little impact on the distortion term δ[.]. One way to make this selection uses the LASSO (least absolute shrinkage and selection operator) technique, which is a regression analysis method that performs both variable selection and regularization, to determine which terms to retain for use in the runtime system. In some implementations, the runtime system is configured with the parameter values x determined at this stage. Note that it should be understood that there are some uses of the techniques described above that omit the adapter completely (i.e., the adapter is a non-essential part of the system), and the parameters are set once (e.g., at manufacturing time), and not adapted during operation, or may be updated from time to time using an offline parameter estimation procedure.
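The variable selection step can be illustrated with a generic LASSO solver. The sketch below uses cyclic coordinate descent with soft thresholding on real-valued data, which is a simplification: the parameters described above are complex, and the specific solver used in practice is not specified here. The names `soft_threshold` and `lasso_cd` and the toy data are hypothetical.

```python
def soft_threshold(z, lam):
    # Shrink z toward zero by lam; values within [-lam, lam] become exactly 0.
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(X, y, lam, sweeps=50):
    """Minimize (1/2N)*||y - X b||^2 + lam*||b||_1 by coordinate descent."""
    N, n = len(X), len(X[0])
    b = [0.0] * n
    for _ in range(sweeps):
        for j in range(n):
            # Correlation of column j with the residual excluding coefficient j.
            rho_j = sum(
                X[i][j] * (y[i] - sum(X[i][k] * b[k] for k in range(n) if k != j))
                for i in range(N)
            ) / N
            z_j = sum(X[i][j] ** 2 for i in range(N)) / N
            b[j] = soft_threshold(rho_j, lam) / z_j
    return b

# Toy design with three orthogonal candidate terms; only the first matters.
X = [[1.0, 1.0, 1.0], [-1.0, 1.0, 1.0], [1.0, -1.0, 1.0], [-1.0, -1.0, 1.0]]
y = [2.0 * row[0] + 0.05 * row[1] for row in X]
coeffs = lasso_cd(X, y, lam=0.1)
kept = [j for j, c in enumerate(coeffs) if c != 0.0]
```

Terms whose coefficients are driven exactly to zero are dropped from the runtime structure, and increasing `lam` retains fewer terms.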

An example of applying the techniques described above starts with the general description of the distortion term

${\delta \lbrack t\rbrack} = {\sum\limits_{k}{{w_{a_{k}}\left\lbrack {t - d_{k}} \right\rbrack}{\sum\limits_{j}{{\varphi_{k,j}\left( {r_{j}\lbrack t\rbrack} \right)}.}}}}$

The complex signals derived from the input, and the real signals derived from the complex signals, have the following full form:

${{\delta \lbrack t\rbrack} = {{\sum\limits_{k = {- 5}}^{+ 5}\; {{u\left\lbrack {t - k} \right\rbrack}{\sum\limits_{j = {- 5}}^{+ 5}{\varphi_{1,k,j}\left( {{u\left\lbrack {t - k - j} \right\rbrack}} \right)}}}} + {\overset{+ 5}{\sum\limits_{l = {- 5}}}{\overset{1}{\sum\limits_{d = 0}}{\frac{\left( {{u\left\lbrack {t - l} \right\rbrack} + {j^{d}{u\left\lbrack {t - l - 1} \right\rbrack}}} \right)}{2}{\varphi_{2,l,d}\left( \frac{{{u\left\lbrack {t - l} \right\rbrack} + {u\left\lbrack {t - l - 1} \right\rbrack}}}{2} \right)}}}} + {\overset{+ 5}{\sum\limits_{m = {- 5}}}{\overset{+ 2}{\sum\limits_{n = {- 2}}}{{u\left\lbrack {t - m} \right\rbrack}{\varphi_{3,m,n}\left( {{{u\left\lbrack {t - m} \right\rbrack}}{{u\left\lbrack {t - m - n} \right\rbrack}}} \right)}}}}}}$

This form creates a total of 198 (=121+22+55) terms. In an experimental example, this set of terms is reduced from 198 terms to 6 terms using a LASSO procedure. These remaining 6 terms result in the distortion term having the form:

${\delta \lbrack t\rbrack} = {{{u\lbrack t\rbrack}{\varphi_{1,0,0}\left( {{u\lbrack t\rbrack}} \right)}} + {{u\left\lbrack {t - 1} \right\rbrack}{\varphi_{1,1,0}\left( {{u\left\lbrack {t - 1} \right\rbrack}} \right)}} + {\frac{\left( {{u\left\lbrack {t - 4} \right\rbrack} + {{ju}\left\lbrack {t - 5} \right\rbrack}} \right)}{2}{\varphi_{2,4,1}\left( \frac{{{u\left\lbrack {t - 4} \right\rbrack} + {u\left\lbrack {t - 5} \right\rbrack}}}{2} \right)}} + {\frac{\left( {{u\left\lbrack {t + 2} \right\rbrack} + {u\left\lbrack {t + 1} \right\rbrack}} \right)}{2}{\varphi_{2,{- 2},0}\left( \frac{{{u\left\lbrack {t + 2} \right\rbrack} + {u\left\lbrack {t + 1} \right\rbrack}}}{2} \right)}} + {{u\left\lbrack {t - 5} \right\rbrack}{\varphi_{3,5,2}\left( {{{u\left\lbrack {t - 5} \right\rbrack}}{{u\left\lbrack {t - 7} \right\rbrack}}} \right)}} + {{u\left\lbrack {t + 5} \right\rbrack}{{\varphi_{3,{- 5},{- 2}}\left( {{{u\left\lbrack {t + 5} \right\rbrack}}{{u\left\lbrack {t + 7} \right\rbrack}}} \right)}.}}}$

This form is computationally efficient because only 6 complex signals w_(k) and 6 real signals r_(p) must be computed at each time step. If each non-linear transformation is represented by 32 linear segments, then the lookup tables have a total of 6 times 33, or 198 complex values. If each non-linear function is represented by 32 piecewise segments defined by 6 kernels, then there are only 36 complex parameter values that need to be adapted (i.e., 6 scale factors for the kernels of each non-linear function, and 6 such non-linear functions).
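The counts quoted in this example follow from the index ranges of the full form and can be reproduced with the short calculation below. The variable names are hypothetical.

```python
# Index ranges taken from the full form of the distortion term.
lags_5 = len(range(-5, 6))      # k, j, l, m each take 11 values
d_values = len(range(0, 2))     # d takes the values 0 and 1
n_values = len(range(-2, 3))    # n takes 5 values

first_family = lags_5 * lags_5      # u[t-k] weighted by phi_{1,k,j}
second_family = lags_5 * d_values   # averaged pairs weighted by phi_{2,l,d}
third_family = lags_5 * n_values    # u[t-m] weighted by phi_{3,m,n}
total_terms = first_family + second_family + third_family

retained = 6                        # terms kept after the LASSO reduction
segments = 32                       # linear segments per lookup table
kernels = 6                         # kernels per non-linear function
table_values = retained * (segments + 1)   # segment endpoints stored per table
adapted_values = retained * kernels        # complex parameters to adapt
```

The table size counts 33 endpoints for 32 segments, which is why the 6 retained functions occupy 198 complex table entries while only 36 complex parameters are adapted.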

The techniques described above may be applied in a wide range of radio-frequency communication systems. For example, the approach illustrated in FIG. 1 may be used for wide area (e.g., cellular) base stations to linearize transmission of one or more channels in a system adhering to a standard, such as 3GPP or IEEE standards (implemented over licensed and unlicensed frequency bands), pre-5G and 5G New Radio (NR), etc. Similarly, the approach can be implemented in a mobile station (e.g., a smartphone, handset, mobile client device (e.g., a vehicle), fixed client device, etc.). Furthermore, the techniques are equally applicable to local area communication (e.g., “WiFi”, the family of 802.11 protocols, etc.) as they are to wide area communication. Furthermore, the approaches can be applied to wired rather than wireless communication, for example, to linearize transmitters in coaxial network distribution, for instance to linearize amplification and transmission stages (e.g., including coaxial transmission lines) for DOCSIS (Data Over Cable Service Interface Specification) head-end systems and client modems. For example, a real high-frequency DOCSIS signal may be digitally demodulated to quadrature components (e.g., a complex representation) at a lower frequency (e.g., baseband) range and the techniques described above may be applied to the demodulated signal. Yet other applications are not necessarily related to electrical signals, and the techniques may be used to linearize mechanical or acoustic actuators (e.g., audio speakers), and optical transmission systems. Finally, although described above in the context of linearizing a transmission path, with a suitable reference signal representing a transmission (e.g., predefined pilot signal patterns) the approach may be used to linearize a receiver, or to linearize a combined transmitter-channel-receiver path.

A summary of a typical use case of the approaches described above is as follows. First, initial data sequences (u[.], y[.]) and/or (v[.], y[.]), as well as corresponding sequences e[.] and p[.] in implementations that make use of these optional inputs, are obtained for a new type of device, for example, for a new cellular base station or a smartphone handset. Using this data, a set of complex signals w_(k) and real signals r_(p) are selected for the runtime system, for example, based on an ad hoc selection approach, or an optimization such as using the LASSO approach. In this selection stage, computational constraints for the runtime system are taken into account so that the computational limitations are not exceeded and/or performance requirements are met. Such computational requirements may be expressed, for example, in terms of computational operations per second, storage requirements, and/or for hardware implementations in terms of circuit area or power requirements. Note that there may be separate computational limits for the predistorter 130, which operates on every input value, and for the adapter, which may operate only from time to time to update the parameters of the system. Having determined the terms to be used in the runtime system, a specification of that system is produced. In some implementations, that specification includes code that will execute on a processor, for example, an embedded processor for the system. In some implementations, the specification includes a design structure that specifies a hardware implementation of the predistorter and/or the adapter. For example, the design structure may include configuration data for a field-programmable gate array (FPGA), or may include a hardware description language specification of an application-specific integrated circuit (ASIC). In such hardware implementations, the hardware device includes input and output ports for the inputs and outputs shown in FIG. 1 for the predistorter and the adapter. In some examples, the memory for the predistorter is external to the device, while in other examples, it is integrated into the device. In some examples, the adapter is implemented in a device separate from the predistorter, in which case the predistorter may have a port for receiving updated values of the adaptation parameters.
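The selection stage summarized above may be sketched as follows. This is only an illustrative example under stated assumptions: the data are real-valued (the signals described above are complex), the computational constraint is reduced to a maximum number of retained terms, and the LASSO problem is solved by plain coordinate descent. The names soft_threshold, lasso and select_terms, and the parameters lam, growth and sweeps, are hypothetical and do not appear in the description.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam (the LASSO proximal step)."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso(columns, y, lam, sweeps=200):
    """Coordinate descent for 0.5*||y - sum_j b[j]*columns[j]||^2 + lam*||b||_1.

    columns[j] holds candidate term j evaluated on the training data.
    """
    b = [0.0] * len(columns)
    residual = list(y)
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            z = sum(c * c for c in col)
            if z == 0.0:
                continue
            # Correlate column j with the residual that excludes term j.
            rho = sum(c * (r + b[j] * c) for c, r in zip(col, residual))
            new_bj = soft_threshold(rho, lam) / z
            delta = new_bj - b[j]
            if delta != 0.0:
                residual = [r - delta * c for r, c in zip(residual, col)]
                b[j] = new_bj
    return b

def select_terms(columns, y, max_terms, lam=1e-3, growth=2.0):
    """Raise the penalty until at most max_terms candidates remain nonzero."""
    while True:
        b = lasso(columns, y, lam)
        kept = [j for j, bj in enumerate(b) if bj != 0.0]
        if len(kept) <= max_terms:
            return kept, lam
        lam *= growth

# Example: four candidate terms, of which only terms 0 and 2 explain y.
candidates = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
observed = [1.0, 1.0, 5.0, 5.0]  # equals 3*candidates[0] - 2*candidates[2]
kept, penalty = select_terms(candidates, observed, max_terms=2)
```

In this example the indices of the retained terms are 0 and 2. In a fuller treatment, the budget could instead be expressed in operations per second, storage, circuit area, or power, as noted above, with each candidate term weighted by its cost.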

In some implementations, a computer accessible non-transitory storage medium includes instructions for causing a digital processor to execute instructions implementing procedures described above. The digital processor may be a general-purpose processor or a special-purpose processor, such as an embedded processor or a controller, and may be a processor core integrated in a hardware device that implements at least some of the functions in dedicated circuitry (e.g., with dedicated arithmetic units, storage registers, etc.). In some implementations, a computer accessible non-transitory storage medium includes a database representative of a system including some or all of the components of the linearization system. Generally speaking, a computer accessible storage medium may include any non-transitory storage media accessible by a computer during use to provide instructions and/or data to the computer. For example, a computer accessible storage medium may include storage media such as magnetic or optical disks and semiconductor memories. Generally, the database (e.g., a design structure) representative of the system may be a database or other data structure which can be read by a program and used, directly or indirectly, to fabricate the hardware comprising the system. For example, the database may be a behavioral-level description or register-transfer level (RTL) description of the hardware functionality in a hardware description language (HDL) such as Verilog or VHDL. The description may be read by a synthesis tool which may synthesize the description to produce a netlist comprising a list of gates from a synthesis library. The netlist comprises a set of gates that also represent the functionality of the hardware comprising the system. The netlist may then be placed and routed to produce a data set describing geometric shapes to be applied to masks. The masks may then be used in various semiconductor fabrication steps to produce a semiconductor circuit or circuits corresponding to the system. In other examples, the database may itself be the netlist (with or without the synthesis library) or the data set.

It is to be understood that the foregoing description is intended to illustrate and not to limit the scope of the invention, which is defined by the scope of the appended claims. Reference signs, including drawing reference numerals and/or algebraic symbols, in parentheses in the claims should not be seen as limiting the extent of the matter protected by the claims; their sole function is to make claims easier to understand by providing a connection between the features mentioned in the claims and one or more embodiments disclosed in the Description and Drawings. Other embodiments are within the scope of the following claims. 

What is claimed is:
 1. A method of signal predistortion for linearizing a non-linear circuit, the method comprising: processing an input signal (u) comprising a plurality of separate band signals (u₁, . . . , u_(N_b)), each separate band signal having a separate frequency range within the input frequency range of the input signal, at least part of the input frequency range containing none of the separate frequency ranges, the processing producing a plurality of transformed signals (w), the transformed signals including at least one transformed signal equal to a combination of multiple separate band signals; determining a plurality of phase-invariant derived signals (r) to be equal to respective non-linear functions of one or more of the transformed signals; transforming the plurality of phase-invariant derived signals (r) according to a plurality of parametric non-linear transformations (Φ) to produce a plurality of gain components (g); forming a distortion term by accumulating a plurality of terms (k), each term being a combination of a transformed signal (w_(a_k)) of the plurality of transformed signals and respective one or more time-varying gain components (g_(i), i∈Λ_(k)) of the plurality of gain components; and providing an output signal (v) determined from the distortion term for application to the non-linear circuit.